Thanks for the wonderful project! Why do I always get results showing an apparent loss of original ability? #25
Comments
The final training loss is about 0.05–0.1, so I think it might not be caused by overfitting? |
Hi! Have you tried directly fine-tuning llama-3-8B-instruct? What happens in that setting? |
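For reference, a minimal sketch of what such a control run could look like with the plain Hugging Face Trainer (not the exact LLaMA-Factory pipeline). The dataset file name, its "messages" column, and all hyperparameters are placeholder assumptions:

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

# "identity.json" stands in for the identity dataset discussed in this issue;
# each record is assumed to hold a list of chat turns under "messages".
dataset = load_dataset("json", data_files="identity.json", split="train")

def tokenize(example):
    # Render the chat turns with the model's own template so the
    # instruct formatting is preserved during fine-tuning.
    text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out-direct-sft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=2e-5,
        logging_steps=10,
        bf16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```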
OK, thanks! Could you share a link as a reference to help me figure out the problem? |
Certainly! Here is the link to Yi-9B: https://huggingface.co/01-ai/Yi-9B and its tech report: https://arxiv.org/pdf/2403.04652 |
Thanks! |
I have posted this new issue: hiyouga/LLaMA-Factory#3811. Could you please help explain? Thanks! |
Training on a small dataset for many epochs can easily lead to overfitting; see the sketch below for one way to catch it. |
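One way to detect that in practice (reusing `model`, `tokenized`, `tokenizer`, and the imports from the sketch above): hold out a small validation split and stop when eval loss stops improving. `EarlyStoppingCallback` and these `TrainingArguments` are standard Trainer features; the split size and step counts are illustrative:

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

# Hold out 10% of the already-tokenized data as a validation set.
split = tokenized.train_test_split(test_size=0.1, seed=42)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out-early-stop",
        evaluation_strategy="steps",   # evaluate periodically during training
        eval_steps=20,
        save_strategy="steps",         # must match eval strategy for best-model reload
        save_steps=20,
        load_best_model_at_end=True,   # roll back to the best checkpoint
        metric_for_best_model="eval_loss",
        greater_is_better=False,
        num_train_epochs=10,           # upper bound; early stopping cuts it short
        learning_rate=2e-5,
        per_device_train_batch_size=1,
        bf16=True,
    ),
    train_dataset=split["train"],
    eval_dataset=split["test"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    # Stop after 3 consecutive evaluations with no improvement in eval loss.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```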
Thanks! |
After fine-tuning llama-3-8B-instruct with the same configuration as the code from https://github.com/hiyouga/LLaMA-Factory/tree/3df986c6793a51ec2cb5f31fd1808cd3a9883bc4/examples/extras/llama_pro, I always see an apparent loss of original ability. I only used the "identity" training dataset. Can you help? Thanks!
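One thing worth checking in this LLaMA-Pro setting: in block expansion, only the newly inserted layers should be trainable. If (nearly) all parameters report `requires_grad=True`, the run degenerates into a full fine-tune on a tiny dataset, which would explain the lost abilities. A minimal sketch of that check in plain PyTorch, reusing `model` from the sketches above (the layer indices are hypothetical; substitute whatever blocks your expansion script actually inserted):

```python
# Hypothetical indices of the blocks added by the expansion step.
new_layer_ids = {8, 17, 26, 35}

# Freeze everything except the expanded blocks. Llama parameter names
# look like "model.layers.8.self_attn.q_proj.weight", so matching on
# "layers.<i>." selects exactly one block per index.
for name, param in model.named_parameters():
    param.requires_grad = any(f"layers.{i}." in name for i in new_layer_ids)

# Report the trainable fraction; for block expansion it should be small.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable / 1e6:.1f}M of {total / 1e6:.1f}M params "
      f"({100 * trainable / total:.2f}%)")
```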