Adding new role "Tool" to support Llama 3.3 models #2371

Open
init27 opened this issue Feb 9, 2025 · 4 comments

@init27

init27 commented Feb 9, 2025

Right now, the roles supported in chat/SFT datasets are ['system', 'user', 'assistant', 'ipython'].

With the latest Llama models (3.3 onwards), we have a new role, tool, which serves the same purpose as ipython.

It would be great to have this added to support fine-tuning for tool calling with torchtune.

Thanks in advance!

@RdoubleA
Contributor

RdoubleA commented Feb 9, 2025

Yes, this is a leftover item we needed to do after the release of 3.3. Renaming ipython to tool is much clearer. Feel free to open a PR to make this change (replace all mentions of the ipython role with tool in code and docs); otherwise, we can keep this issue open until it gets picked up.

One thing to consider is that this would be BC breaking, so we should make the change in a way that doesn't break older versions or recipes.
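
A minimal sketch of one way to make the rename without breaking older datasets or recipes: accept the legacy name as a deprecated alias and normalize it to the new one. All names here (`CANONICAL_ROLES`, `DEPRECATED_ROLE_ALIASES`, `normalize_role`) are hypothetical and chosen for illustration; this is not torchtune's actual API.

```python
import warnings

# Hypothetical names for illustration only; not torchtune's actual API.
CANONICAL_ROLES = {"system", "user", "assistant", "tool"}
DEPRECATED_ROLE_ALIASES = {"ipython": "tool"}


def normalize_role(role: str) -> str:
    """Return the canonical role name, accepting legacy aliases with a warning."""
    if role in DEPRECATED_ROLE_ALIASES:
        new_role = DEPRECATED_ROLE_ALIASES[role]
        warnings.warn(
            f"Role '{role}' is deprecated; use '{new_role}' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_role
    if role not in CANONICAL_ROLES:
        raise ValueError(f"Unsupported role: {role!r}")
    return role
```

With this approach, existing data that still says ipython keeps loading, while new data can use tool directly.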

@init27
Author

init27 commented Feb 9, 2025

@RdoubleA Thanks for the reply!

Shouldn't we also keep the ipython role, given that we might still want to support 3.1 and 3.2? I leave the decision to the team, of course. It might be better to support only the latest 70B; however, for the smaller model(s) we still need to keep the ipython role, if I understand correctly.
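
To make the idea concrete, here is a rough sketch of keeping both names and choosing between them per model family. The lookup table and function names are hypothetical (not torchtune's actual API), and the role assigned to each family simply follows what is said in this thread.

```python
# Hypothetical lookup for illustration only; not torchtune's actual API.
# The role name per model family follows the claims made in this thread.
TOOL_ROLE_BY_FAMILY = {
    "llama3_1": "ipython",
    "llama3_2": "ipython",
    "llama3_3": "tool",
}

# Both spellings are treated as "tool output" messages on input.
TOOL_ROLE_NAMES = {"ipython", "tool"}


def tool_role_for(model_family: str) -> str:
    """Return the tool-output role name expected by the given model family."""
    if model_family not in TOOL_ROLE_BY_FAMILY:
        raise ValueError(f"Unknown model family: {model_family!r}")
    return TOOL_ROLE_BY_FAMILY[model_family]


def relabel_tool_messages(messages: list[dict], model_family: str) -> list[dict]:
    """Return copies of the messages with tool-output roles renamed for the family."""
    target = tool_role_for(model_family)
    return [
        {**msg, "role": target} if msg["role"] in TOOL_ROLE_NAMES else dict(msg)
        for msg in messages
    ]
```

This way a single dataset could be used for either family, with the role name resolved when the messages are prepared for a given model.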

@Ankur-singh
Contributor

Thanks for pointing it out @init27.

After a brief investigation, I see that only the Qwen2_5 and Mistral tokenizers and chat templates make use of the ipython role. As tool calling is becoming more and more important, it would be a great idea to generalize it and also add an example config.

One thing to consider is that this would be BC breaking, so we should make the change in a way that doesn't break older versions or recipes.

Totally! @RdoubleA I'll be happy to pick it up. Can you please assign it to me?

I'm working with this part of the codebase for the first time, so I might need some help.

@init27
Author

init27 commented Feb 9, 2025

Thanks for taking a look, @Ankur-singh!

After a brief investigation, I see that only the Qwen2_5 and Mistral tokenizers and chat templates make use of the ipython role. As tool calling is becoming more and more important, it would be a great idea to generalize it and also add an example config.

Llama 3.1 and 3.2 models do as well, so it's an interesting split, with 3.3 onwards using tool.

It might be really cool to include an example config using a dataset like Nous-func-calling for the demo config if you'd like.

You'd want to start adding the role from here and then also look at adding an example config for the demo.

Thanks again for picking this up!
