Add support for data parallel QLoRA training via DeepSpeed Zero stages 0, 1 and 2. #11380
Job | Run time |
---|---|
 | 26m 40s |
 | 5s |
 | 20m 37s |
 | 24m 55s |
 | 17m 40s |
 | 25m 36s |
 | 25m 0s |
 | 6m 57s |
 | 44m 52s |
 | 24m 11s |
 | 9m 15s |
 | 28m 26s |
 | 13m 35s |
 | 30m 6s |
 | 10m 14s |
Total | 5h 8m 9s |