Transformer Quality Target Change #171
Comments
SWG Notes: We intend to move the quality target to 27. There is an AI to modify the reference and confirm that it reaches the target.
SWG Notes: AI(Cray) - Check target quality on English to French and English to German.
SWG Notes:
English to German: Published accuracy is 28.4. We have not been able to hit 27 at the reference batch size yet and are continuing the parameter search. We expect the reference to hit 27, but with changes to learning rate / batch size. Google believes 27 can be hit at ~64k tokens global batch size. Above this they haven't been able to converge, but are still exploring. Roughly doubles the number of epochs versus 25.
English to French: Published accuracy is 43... Google has seen around 41, but the investigation is ongoing.
Continuing Cray AI.
SWG Notes: We feel that variance is a concern here, especially at a target of 27. We'd like to increase accuracy, but want more information on variance before setting the target. AI(Cray & Google & CISCO) - Do some runs to 26 to look at variance (and provide data for 25.5 too).
I was able to get 8 transformer reference runs in and saw convergence to 26.0 on English-to-German within 5 epochs for 5 of the 8 runs, and within 6 epochs for the remaining 3. Here is the relevant grep from the logs:
grep "Bleu score (uncased)" mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_*new/translation/logfile | grep ": 26"
mlperf_translation_fp32_run_np1_bleu26_eng_to_germ_0_new/translation/logfile:Starting iteration 5
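For anyone who wants a quick number for the variance discussion, here is a minimal sketch that summarizes the spread of epochs-to-target across those 8 runs. It assumes each run is counted at its upper bound (5 or 6 epochs), since the comment only says "within" that many epochs; the names `epochs_to_target` and `summarize` are made up for illustration and are not part of the reference code.

```python
from statistics import mean, stdev

# Epochs by which each of the 8 runs reached BLEU 26.0, read as upper bounds
# from the comment above: 5 runs within 5 epochs, 3 runs within 6 epochs.
# (Assumption: the exact epoch per run is not given, only these bounds.)
epochs_to_target = [5] * 5 + [6] * 3


def summarize(epochs):
    """Return simple spread statistics for epochs-to-target across runs."""
    return {
        "runs": len(epochs),
        "mean": mean(epochs),
        "stdev": stdev(epochs),  # sample standard deviation (n - 1)
        "min": min(epochs),
        "max": max(epochs),
    }


summary = summarize(epochs_to_target)
print(summary)
```

The same helper could be fed results from runs to 25.5 or 27 once they are available, which would make it easier to compare spread across candidate targets.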
SWG Notes: No change to target accuracy for v0.6. We think for v0.7 we can move to target quality of 27 given more time to work on the issue.
Active, moving to backlog.
Note to follow up about the current transformer quality target (25->27?).