PLZ make padding_free for DataCollatorForChatML. #2736

Open
YooSungHyun opened this issue Feb 2, 2025 · 1 comment
Labels
✨ enhancement New feature or request 🏋 GKD Related to GKD 🙋 help from community wanted Open invitation for community members to contribute

Comments

@YooSungHyun

Feature request

I want to implement GKD in a padding-free manner, so that training uses as little memory as possible.

The challenge is that GKD involves inference from either the student model or the teacher model. If we try to maximize sequence packing within a batch, aligning the shapes for loss computation is expected to become quite complex.
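To make the shape problem concrete, here is a minimal pure-Python sketch of what padding-free packing could look like for prompt/completion pairs. It is not TRL's implementation: the field names `prompt_ids` and `completion_ids` and both helper functions are hypothetical, and it only covers the collation step, not on-policy generation. The idea is that if the student and the teacher both consume the same packed sequence, their per-token outputs share one flat token axis, and the saved boundaries let per-token values be split back into per-example pieces.

```python
from typing import Dict, List, Sequence

IGNORE_INDEX = -100  # label value ignored by PyTorch's cross-entropy by default


def pack_padding_free(examples: List[Dict[str, List[int]]]) -> Dict[str, List[int]]:
    """Concatenate (prompt, completion) pairs into one flat sequence, no padding.

    Each example is a hypothetical dict with "prompt_ids" and "completion_ids".
    Prompts are assumed non-empty, so the first token of every packed sequence
    is masked and is never scored as a target across a sequence boundary.
    """
    input_ids: List[int] = []
    position_ids: List[int] = []
    labels: List[int] = []
    cu_seqlens: List[int] = [0]  # cumulative boundaries of the packed sequences

    for ex in examples:
        prompt, completion = ex["prompt_ids"], ex["completion_ids"]
        seq = prompt + completion
        input_ids.extend(seq)
        # Positions restart at 0 so that sequence boundaries stay recoverable.
        position_ids.extend(range(len(seq)))
        # Only completion tokens contribute to the loss.
        labels.extend([IGNORE_INDEX] * len(prompt) + completion)
        cu_seqlens.append(cu_seqlens[-1] + len(seq))

    return {
        "input_ids": input_ids,
        "position_ids": position_ids,
        "labels": labels,
        "cu_seqlens": cu_seqlens,
    }


def split_by_sequence(flat_values: Sequence, cu_seqlens: List[int]) -> List[Sequence]:
    """Split per-token values on the flat axis back into per-example chunks."""
    return [flat_values[s:e] for s, e in zip(cu_seqlens[:-1], cu_seqlens[1:])]
```

With a layout like this, a per-token divergence between student and teacher could be computed once over the flat axis and masked where `labels == IGNORE_INDEX`. The harder part, which this sketch does not address, is that on-policy generation changes each completion's length before packing.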

Could someone who is really skilled in algorithms create a solution that takes this into account?
I’m begging you, please! 🙏

Motivation

GKD consumes a significant amount of GPU memory. A padding-free implementation could improve both training speed and memory efficiency.

Your contribution

I tried to design the code, but since I'm not very smart, I couldn't come up with a good structure... I thought the only option was to ask someone for help.😥

@github-actions github-actions bot added 🏋 GKD Related to GKD ✨ enhancement New feature or request 🙋 help from community wanted Open invitation for community members to contribute labels Feb 2, 2025
@YooSungHyun
Author

@qgallouedec

I saw that you implemented DPO in a padding-free manner. Would you be able to contribute the same for GKD? I know it's a lot to ask, but I’d really appreciate it. 😅
