
Add a GPT-2 training example #19

Open
bwdGitHub opened this issue Dec 10, 2021 · 2 comments
Labels: enhancement (New feature or request)

Comments

@bwdGitHub
Collaborator

We would like to use these issues to gauge user interest.

The GPT-2 implementation can be used for further language-model training (fine-tuning), but there is currently no example demonstrating this in the repo or elsewhere.

Making this possible on a typical consumer GPU will likely require some technique to reduce the amount of GPU memory needed for training. There are a number of options (a rough sketch of a few of them follows the list):

  1. Add support for a smaller GPT-2 model.
  2. Train only a subset of the GPT-2 parameters.
  3. Use gradient accumulation.
  4. Use gradient checkpointing.
  5. Use reduced-precision gradients.
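
As a rough illustration of how options 2, 3, and 5 combine, here is a minimal PyTorch sketch. It uses the Hugging Face `transformers` GPT-2 as a stand-in for this repo's own GPT-2 implementation (which is not used here), and `train_batches` is a hypothetical iterable of token-id tensors; treat it as an outline of the techniques rather than a drop-in example for this repo.

```python
import torch
from transformers import GPT2LMHeadModel

# Stand-in GPT-2 language model (assumption: not this repo's implementation).
model = GPT2LMHeadModel.from_pretrained("gpt2").cuda()

# Option 2: train only a subset of parameters.
# Freeze everything except the last transformer block and the final layer norm.
for name, param in model.named_parameters():
    param.requires_grad = (
        name.startswith("transformer.h.11") or name.startswith("transformer.ln_f")
    )

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=5e-5
)

# Option 5: reduced-precision (mixed-precision) training via autocast + GradScaler.
scaler = torch.cuda.amp.GradScaler()

# Option 3: gradient accumulation over several micro-batches.
accum_steps = 8

model.train()
optimizer.zero_grad()
for step, input_ids in enumerate(train_batches):  # hypothetical data iterable
    input_ids = input_ids.cuda()
    with torch.cuda.amp.autocast():
        # Language-model loss; the model shifts the labels internally.
        loss = model(input_ids, labels=input_ids).loss / accum_steps
    scaler.scale(loss).backward()
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad()
```

Options 1 and 4 are complementary to the above: a smaller model reduces both parameter and activation memory, while gradient checkpointing recomputes activations during the backward pass instead of storing them.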
bwdGitHub added the enhancement label on Dec 10, 2021
@misataguchi
Contributor

I have received one inquiry about fine-tuning GPT-2.

@qwer1304

I second the request to be able to fine-tune GPT-2.
