Adds `multiply_grads`, akin to fairseq's.
#5161
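
For context, here is a minimal sketch of what a helper like this typically does, assuming PyTorch parameters. The free-function form, the `params` and `c` argument names, and the toy `Linear` model are illustrative assumptions, not the code in this PR.

```python
import torch


@torch.no_grad()
def multiply_grads(params, c):
    """Multiply the gradient of every parameter in `params` by `c`, in place.

    Hypothetical standalone sketch; `c` may be a Python number or a tensor.
    """
    for p in params:
        if p.grad is None:
            # Parameter did not take part in the backward pass; nothing to scale.
            continue
        if torch.is_tensor(c):
            # Keep the scale factor on the same device as the gradient.
            c = c.to(p.grad.device)
        p.grad.mul_(c)


# Usage: accumulate gradients over 4 micro-batches, then average them.
model = torch.nn.Linear(3, 1)
num_micro_batches = 4
for _ in range(num_micro_batches):
    model(torch.ones(2, 3)).sum().backward()
multiply_grads(model.parameters(), 1.0 / num_micro_batches)
```

Scaling gradients after accumulation (rather than scaling each loss) is useful when the normalizer, such as the total token count, is only known once all micro-batches have been processed.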