We would like to use these issues to gauge user interest.
It is possible to use the GPT-2 implementation for further language-model training, but there is currently no example demonstrating this in the repo or elsewhere.
Making this work on a typical consumer GPU will likely require some technique to reduce the amount of GPU memory needed during training. There are a number of options:
Add support for a smaller GPT-2 model.
Train only a subset of the GPT-2 parameters.
Use gradient accumulation.
Use gradient checkpointing.
Use reduced-precision gradients.
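To illustrate the gradient-accumulation option above, here is a minimal NumPy sketch (making no assumptions about this repo's API): the batch is split into micro-batches, each micro-batch's gradient is computed and accumulated, and the scaled sum equals the full-batch gradient. In a real network, only one micro-batch's activations need to be live at a time, which is where the memory saving comes from.

```python
import numpy as np

def mse_grad(X, y, w):
    """Gradient of mean squared error for a linear model (illustrative stand-in
    for a full backward pass through GPT-2)."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))
w_true = rng.normal(size=8)
y = X @ w_true
w = np.zeros(8)

# Full-batch gradient: in a real model this has the highest peak memory.
full_grad = mse_grad(X, y, w)

# Gradient accumulation over 4 equally sized micro-batches.
acc_grad = np.zeros_like(w)
for Xc, yc in zip(np.split(X, 4), np.split(y, 4)):
    # Scale each micro-batch gradient so the accumulated sum is the
    # mean over the full batch.
    acc_grad += mse_grad(Xc, yc, w) / 4

print(np.allclose(full_grad, acc_grad))  # the two gradients match
```

The optimizer step is applied only after all micro-batches are accumulated, so the parameter update is mathematically the same as a full-batch step.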