We would like to use these issues to gauge user interest.
It is possible to use the GPT-2 implementation for further language-model training, but there is currently no example demonstrating this in the repo or elsewhere.
Making this work on a typical consumer GPU will likely require some technique to reduce the amount of GPU memory needed during training. There are a number of options:
Add support for a smaller GPT-2 model.
Train only a subset of the GPT-2 parameters.
Use gradient accumulation.
Use gradient checkpointing.
Use reduced-precision gradients.
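To illustrate the gradient-accumulation option above, here is a minimal NumPy sketch (making no assumptions about this repo's API): the batch is split into micro-batches, each micro-batch's gradient is computed and accumulated, and the scaled sum equals the full-batch gradient. In a real network, only one micro-batch's activations need to be live at a time, which is where the memory saving comes from.

```python
import numpy as np

def mse_grad(X, y, w):
    """Gradient of mean squared error for a linear model (illustrative stand-in
    for a full backward pass through GPT-2)."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))
w_true = rng.normal(size=8)
y = X @ w_true
w = np.zeros(8)

# Full-batch gradient: in a real model this has the highest peak memory.
full_grad = mse_grad(X, y, w)

# Gradient accumulation over 4 equally sized micro-batches.
acc_grad = np.zeros_like(w)
for Xc, yc in zip(np.split(X, 4), np.split(y, 4)):
    # Scale each micro-batch gradient so the accumulated sum is the
    # mean over the full batch.
    acc_grad += mse_grad(Xc, yc, w) / 4

print(np.allclose(full_grad, acc_grad))  # the two gradients match
```

The optimizer step is applied only after all micro-batches are accumulated, so the parameter update is mathematically the same as a full-batch step.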