llama3.2 #290

Using llama2-70b as the teacher model and llama3.2-3b as the student model, training runs normally and the loss decreases steadily. However, when running inference with the saved model, the output contains large amounts of incoherent English mixed with Chinese and Korean, and generation never stops. How can this be resolved?

Comments
MiniLLM requires the teacher model and the student model to use the same tokenization. llama2 and llama3.2 have different vocabulary sizes, which is likely where the problem comes from. You could try replacing llama2-70B with llama3.1-70B so that the tokenization stays consistent with the student model.
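Since this mismatch is easy to miss, a quick sanity check before training is to compare the two tokenizers directly. The sketch below uses the Hugging Face `transformers` library; the checkpoint names are placeholders standing in for the teacher/student paths actually used, not part of the MiniLLM codebase.

```python
from transformers import AutoTokenizer

# Placeholder checkpoint names -- substitute the teacher/student paths you train with.
teacher_path = "meta-llama/Llama-2-70b-hf"
student_path = "meta-llama/Llama-3.2-3B"

teacher_tok = AutoTokenizer.from_pretrained(teacher_path)
student_tok = AutoTokenizer.from_pretrained(student_path)

# Llama 2 uses a ~32k vocabulary while Llama 3.x uses a ~128k vocabulary,
# so the teacher's token-level distribution does not align with the
# student's output layer.
print("teacher vocab size:", teacher_tok.vocab_size)
print("student vocab size:", student_tok.vocab_size)

# Even for the same text, the two tokenizers produce different token IDs,
# which breaks a token-by-token distillation objective.
sample = "Knowledge distillation requires a shared tokenization."
teacher_ids = teacher_tok(sample, add_special_tokens=False)["input_ids"]
student_ids = student_tok(sample, add_special_tokens=False)["input_ids"]
print("identical tokenization:", teacher_ids == student_ids)
```

If the vocabulary sizes or the ID sequences differ, the simplest fix is the one suggested above: pick a teacher from the same family as the student (for example llama3.1-70B for a llama3.2 student) so that both share a tokenizer.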
@t1101675 This issue comes up quite often. Would you consider adding compatibility for it, or providing a tool to handle it?
Thanks for the reply!
We will consider adding support for the qwen2.5 vocabulary-size mismatch in the near future.