Help with training parameters #433

Open
ltm920716 opened this issue Jan 17, 2025 · 0 comments

Comments

@ltm920716

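Hi,
I'd like to continue training from the qwen2.5-72b model and would appreciate advice on training parameters:
1. What data volume is suited to SFT only, and what data volume is suited to continued pretraining?
2. To reach a reasonably good MFU, what parallelism settings and data-related parameters would you recommend (assuming sufficient compute resources)?

My manager recently assigned me the task of training on vertical-domain data. I'm new to this and finding it hard to judge whether to do continued pretraining or go straight to SFT, and how to get the best MFU out of training. Any pointers would be much appreciated. Thank you!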
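For context on the MFU figure asked about here: it is usually estimated as the model FLOPs actually achieved per second divided by the aggregate peak FLOPs of the hardware. Below is a minimal sketch, assuming the common approximation of roughly 6 FLOPs per parameter per token for a forward plus backward pass of a dense transformer (attention FLOPs are ignored). The function name `estimate_mfu` and every number in the usage lines are hypothetical placeholders, not measurements.

```python
def estimate_mfu(n_params, tokens_per_second, n_gpus, peak_flops_per_gpu):
    """Rough MFU estimate for a dense transformer.

    Assumes ~6 FLOPs per parameter per token (forward + backward),
    which ignores attention FLOPs and any activation recomputation.
    """
    achieved_flops = 6.0 * n_params * tokens_per_second  # model FLOPs/s
    peak_flops = n_gpus * peak_flops_per_gpu             # hardware peak FLOPs/s
    return achieved_flops / peak_flops


# Hypothetical inputs for illustration only: replace with the measured
# throughput and the vendor's stated peak for the precision in use.
mfu = estimate_mfu(
    n_params=72e9,               # approximate parameter count (assumption)
    tokens_per_second=20_000,    # measured across the whole job (placeholder)
    n_gpus=64,                   # placeholder cluster size
    peak_flops_per_gpu=312e12,   # placeholder peak, check the GPU datasheet
)
print(f"estimated MFU: {mfu:.1%}")
```

Measuring `tokens_per_second` for a few candidate parallelism and batch-size settings and comparing the resulting estimates is one way to make the tuning question concrete.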
