Reminder
System Info
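I sampled multiple outputs from the model and used a large model to pick the better and worse responses, which gave me the chosen/rejected pairs for DPO training data. After training, the model started producing repetitive output, and the repetition is even more frequent on prompts from the training data. Looking at the loss, I don't think it guarantees that the model actually generates according to the target. Should the current loss be improved? For the simpo, orpo, and sigmoid losses, for example, I feel an SFT term (CE loss) should be added in each case to keep the output from drifting.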
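To make the proposal concrete, here is a minimal sketch of a sigmoid DPO loss with an added SFT term on the chosen response. It is only an illustration of the idea, not the project's actual implementation: the function name `dpo_with_sft_loss`, the `sft_weight` parameter, and the per-token normalization of the SFT term are all assumptions. Inputs are summed sequence log-probabilities under the policy and the reference model.

```python
import math


def log_sigmoid(x: float) -> float:
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_with_sft_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    chosen_len: int,
    beta: float = 0.1,
    sft_weight: float = 1.0,  # hypothetical knob; 0.0 recovers plain sigmoid DPO
) -> float:
    # DPO only looks at the *difference* of the two log-ratios.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    dpo_loss = -log_sigmoid(beta * margin)
    # SFT (cross-entropy) term: average negative log-likelihood per chosen token.
    sft_loss = -policy_chosen_logp / chosen_len
    return dpo_loss + sft_weight * sft_loss


# The policy lowers the log-probability of BOTH responses, but lowers the
# rejected one more. The DPO term alone rewards this.
before = dpo_with_sft_loss(-20.0, -30.0, -20.0, -30.0, chosen_len=10, sft_weight=0.0)
after = dpo_with_sft_loss(-40.0, -60.0, -20.0, -30.0, chosen_len=10, sft_weight=0.0)
print(before > after)

# With the SFT term, the same drift is penalized.
before_sft = dpo_with_sft_loss(-20.0, -30.0, -20.0, -30.0, chosen_len=10)
after_sft = dpo_with_sft_loss(-40.0, -60.0, -20.0, -30.0, chosen_len=10)
print(before_sft < after_sft)
```

The two comparisons show the concern: because the preference term depends only on the margin, it can keep decreasing while the likelihood of the chosen response falls, and the extra CE term is one way to push back against that. The right value for `sft_weight` would have to be tuned.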
Reproduction
Others
No response