
Commit d202cfb

docs(readme): Fix links in README.md (#325)
1 parent 0b5b419 commit d202cfb

1 file changed: README.md (23 additions & 22 deletions)
@@ -43,7 +43,7 @@ Gradients $\mathcal A_{\text{UPGrad}}$: it
 projects each gradient onto the dual cone, and averages the projections. This ensures that the
 update will always be beneficial to each individual objective (given a sufficiently small step
 size). In addition to $\mathcal A_{\text{UPGrad}}$, TorchJD supports
-[more than 10 aggregators from the literature](https://torchjd.org/docs/aggregation).
+[more than 10 aggregators from the literature](https://torchjd.org/stable/docs/aggregation).
 
 ## Installation
 <!-- start installation -->
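
The hunk above restates UPGrad's rule: project each objective's gradient onto the dual cone of the rows of the Jacobian `J`, then average the projections. A minimal NumPy sketch of that rule (not TorchJD's implementation; `project_onto_dual_cone` and `upgrad` are names made up here, and the projection is posed as a non-negative least-squares problem):

```python
import numpy as np
from scipy.optimize import nnls


def project_onto_dual_cone(J: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Project g onto the dual cone {x : J @ x >= 0} of the rows of J.

    By the KKT conditions, the projection is g + J.T @ lam, where lam >= 0
    minimizes ||J.T @ lam + g||, i.e. a non-negative least-squares problem.
    """
    lam, _ = nnls(J.T, -g)
    return g + J.T @ lam


def upgrad(J: np.ndarray) -> np.ndarray:
    # Project each gradient (row of J) onto the dual cone, then average.
    projections = [project_onto_dual_cone(J, g) for g in J]
    return np.mean(projections, axis=0)


J = np.array([[-4.0, 1.0, 1.0], [6.0, 1.0, 1.0]])
u = upgrad(J)
assert np.all(J @ u >= -1e-9)  # the update conflicts with no objective
print(u)
```

Because the dual cone is convex, the averaged projection `u` stays inside it, i.e. `J @ u >= 0`: this is the "beneficial to each individual objective" guarantee the text refers to.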
@@ -58,7 +58,7 @@ The main way to use TorchJD is to replace the usual call to `loss.backward()` by
 `torchjd.backward` or `torchjd.mtl_backward`, depending on the use-case.
 
 The following example shows how to use TorchJD to train a multi-task model with Jacobian descent,
-using [UPGrad](https://torchjd.org/docs/aggregation/upgrad/).
+using [UPGrad](https://torchjd.org/stable/docs/aggregation/upgrad/).
 
 ```diff
 import torch
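
The diff view truncates this example after `import torch`. The core substitution it illustrates is replacing `loss.backward()` with `torchjd.mtl_backward`; a hedged sketch of that pattern follows (module sizes are illustrative, and the keyword names `losses`, `features`, and `aggregator` are assumed from the linked docs rather than taken from this page):

```python
import torch
from torch.nn import Linear, MSELoss, ReLU, Sequential
from torch.optim import SGD

from torchjd import mtl_backward
from torchjd.aggregation import UPGrad

# Shared trunk and two task-specific heads (shapes are illustrative).
shared = Sequential(Linear(10, 5), ReLU())
head1, head2 = Linear(5, 1), Linear(5, 1)
params = [*shared.parameters(), *head1.parameters(), *head2.parameters()]
optimizer = SGD(params, lr=0.1)
loss_fn = MSELoss()

x = torch.randn(16, 10)
y1, y2 = torch.randn(16, 1), torch.randn(16, 1)

features = shared(x)
loss1 = loss_fn(head1(features), y1)
loss2 = loss_fn(head2(features), y2)

optimizer.zero_grad()
# Instead of summing the losses and calling loss.backward(), aggregate the
# Jacobian of the losses w.r.t. the shared parameters with UPGrad.
mtl_backward(losses=[loss1, loss2], features=features, aggregator=UPGrad())
optimizer.step()
```

Per the docs, `mtl_backward` fills the parameters' `.grad` fields, so the usual `optimizer.step()` applies unchanged.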
@@ -103,33 +103,34 @@ using [UPGrad](https://torchjd.org/docs/aggregation/upgrad/).
 > In this example, the Jacobian is only with respect to the shared parameters. The task-specific
 > parameters are simply updated via the gradient of their task’s loss with respect to them.
 
-More usage examples can be found [here](https://torchjd.org/examples/).
+More usage examples can be found [here](https://torchjd.org/stable/examples/).
 
 ## Supported Aggregators
 TorchJD provides many existing aggregators from the literature, listed in the following table.
 
 <!-- recommended aggregators first, then alphabetical order -->
-| Aggregator | Publication |
-|----------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| [UPGrad](https://torchjd.org/docs/aggregation/upgrad/) (recommended) | [Jacobian Descent For Multi-Objective Optimization](https://arxiv.org/pdf/2406.16232) |
-| [AlignedMTL](https://torchjd.org/docs/aggregation/aligned_mtl/) | [Independent Component Alignment for Multi-Task Learning](https://arxiv.org/pdf/2305.19000) |
-| [CAGrad](https://torchjd.org/docs/aggregation/cagrad/) | [Conflict-Averse Gradient Descent for Multi-task Learning](https://arxiv.org/pdf/2110.14048) |
-| [ConFIG](https://torchjd.org/docs/aggregation/config/) | [ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks](https://arxiv.org/pdf/2408.11104) |
-| [Constant](https://torchjd.org/docs/aggregation/constant/) | - |
-| [DualProj](https://torchjd.org/docs/aggregation/dualproj/) | [Gradient Episodic Memory for Continual Learning](https://arxiv.org/pdf/1706.08840) |
-| [GradDrop](https://torchjd.org/docs/aggregation/graddrop/) | [Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout](https://arxiv.org/pdf/2010.06808) |
-| [IMTL-G](https://torchjd.org/docs/aggregation/imtl_g/) | [Towards Impartial Multi-task Learning](https://discovery.ucl.ac.uk/id/eprint/10120667/) |
-| [Krum](https://torchjd.org/docs/aggregation/krum/) | [Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent](https://proceedings.neurips.cc/paper/2017/file/f4b9ec30ad9f68f89b29639786cb62ef-Paper.pdf) |
-| [Mean](https://torchjd.org/docs/aggregation/mean/) | - |
-| [MGDA](https://torchjd.org/docs/aggregation/mgda/) | [Multiple-gradient descent algorithm (MGDA) for multiobjective optimization](https://www.sciencedirect.com/science/article/pii/S1631073X12000738) |
-| [Nash-MTL](https://torchjd.org/docs/aggregation/nash_mtl/) | [Multi-Task Learning as a Bargaining Game](https://arxiv.org/pdf/2202.01017) |
-| [PCGrad](https://torchjd.org/docs/aggregation/pcgrad/) | [Gradient Surgery for Multi-Task Learning](https://arxiv.org/pdf/2001.06782) |
-| [Random](https://torchjd.org/docs/aggregation/random/) | [Reasonable Effectiveness of Random Weighting: A Litmus Test for Multi-Task Learning](https://arxiv.org/pdf/2111.10603) |
-| [Sum](https://torchjd.org/docs/aggregation/sum/) | - |
-| [Trimmed Mean](https://torchjd.org/docs/aggregation/trimmed_mean/) | [Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates](https://proceedings.mlr.press/v80/yin18a/yin18a.pdf) |
+| Aggregator | Publication |
+|-----------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [UPGrad](https://torchjd.org/stable/docs/aggregation/upgrad/) (recommended) | [Jacobian Descent For Multi-Objective Optimization](https://arxiv.org/pdf/2406.16232) |
+| [AlignedMTL](https://torchjd.org/stable/docs/aggregation/aligned_mtl/) | [Independent Component Alignment for Multi-Task Learning](https://arxiv.org/pdf/2305.19000) |
+| [CAGrad](https://torchjd.org/stable/docs/aggregation/cagrad/) | [Conflict-Averse Gradient Descent for Multi-task Learning](https://arxiv.org/pdf/2110.14048) |
+| [ConFIG](https://torchjd.org/stable/docs/aggregation/config/) | [ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks](https://arxiv.org/pdf/2408.11104) |
+| [Constant](https://torchjd.org/stable/docs/aggregation/constant/) | - |
+| [DualProj](https://torchjd.org/stable/docs/aggregation/dualproj/) | [Gradient Episodic Memory for Continual Learning](https://arxiv.org/pdf/1706.08840) |
+| [GradDrop](https://torchjd.org/stable/docs/aggregation/graddrop/) | [Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout](https://arxiv.org/pdf/2010.06808) |
+| [IMTL-G](https://torchjd.org/stable/docs/aggregation/imtl_g/) | [Towards Impartial Multi-task Learning](https://discovery.ucl.ac.uk/id/eprint/10120667/) |
+| [Krum](https://torchjd.org/stable/docs/aggregation/krum/) | [Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent](https://proceedings.neurips.cc/paper/2017/file/f4b9ec30ad9f68f89b29639786cb62ef-Paper.pdf) |
+| [Mean](https://torchjd.org/stable/docs/aggregation/mean/) | - |
+| [MGDA](https://torchjd.org/stable/docs/aggregation/mgda/) | [Multiple-gradient descent algorithm (MGDA) for multiobjective optimization](https://www.sciencedirect.com/science/article/pii/S1631073X12000738) |
+| [Nash-MTL](https://torchjd.org/stable/docs/aggregation/nash_mtl/) | [Multi-Task Learning as a Bargaining Game](https://arxiv.org/pdf/2202.01017) |
+| [PCGrad](https://torchjd.org/stable/docs/aggregation/pcgrad/) | [Gradient Surgery for Multi-Task Learning](https://arxiv.org/pdf/2001.06782) |
+| [Random](https://torchjd.org/stable/docs/aggregation/random/) | [Reasonable Effectiveness of Random Weighting: A Litmus Test for Multi-Task Learning](https://arxiv.org/pdf/2111.10603) |
+| [Sum](https://torchjd.org/stable/docs/aggregation/sum/) | - |
+| [Trimmed Mean](https://torchjd.org/stable/docs/aggregation/trimmed_mean/) | [Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates](https://proceedings.mlr.press/v80/yin18a/yin18a.pdf) |
 
 The following example shows how to instantiate
-[UPGrad](https://torchjd.org/docs/aggregation/upgrad/) and aggregate a simple matrix `J` with it.
+[UPGrad](https://torchjd.org/stable/docs/aggregation/upgrad/) and aggregate a simple matrix `J` with
+it.
 ```python
 from torch import tensor
 from torchjd.aggregation import UPGrad
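
The page cuts this snippet off after its two imports. A plausible completion, consistent with the surrounding sentence (the values in `J`, and the assumption that an `UPGrad` instance is applied directly to the matrix, are illustrative rather than taken from this page):

```python
from torch import tensor
from torchjd.aggregation import UPGrad

aggregator = UPGrad()
# Each row of J is the gradient of one objective; the values are illustrative.
J = tensor([[-4.0, 1.0, 1.0], [6.0, 1.0, 1.0]])

u = aggregator(J)  # aggregate the Jacobian into a single update direction
print(u)
```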
