Thanks for your great work! I'm updating some old code and would like to use TRL to finetune on the final turn of multi-turn conversations. The simple approaches I tried don't accomplish that.
Let's say I'd like to finetune `meta-llama/Llama-3.2-1B-Instruct`, though ideally without hardcoding templates for it. I'll keep my report here short since I expect there's a straightforward way I just can't find in the docs.
I've also tried without this `trainset.map`. When doing that, I attempted to set up a data collator, both `DataCollatorForCompletionOnlyLM` and `DataCollatorForChatML`, but ran into a different issue with each. The former has trouble locating the `instruction_template` and `response_template` in the tokenized examples. The latter expects untokenized examples but receives them already tokenized, so it crashes because it cannot find the key `messages` in the examples.
Appreciate your help!
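For reference, the behaviour being asked for (loss on the final assistant turn only, with no hardcoded template strings) can be sketched by tokenizing each conversation twice with the tokenizer's own chat template: once without the last turn and once in full. This is only a minimal sketch, not TRL's method. `mask_all_but_final_turn` is a hypothetical helper name, and the code assumes that the templated prompt is an exact token prefix of the templated full conversation, which may not hold for every chat template. Whether a given TRL version passes precomputed `labels` through unchanged is also an assumption to verify.

```python
from typing import Any, Dict, List

# Label value that PyTorch's cross-entropy loss ignores by default.
IGNORE_INDEX = -100


def mask_all_but_final_turn(
    tokenizer: Any, messages: List[Dict[str, str]]
) -> Dict[str, List[int]]:
    """Tokenize a conversation and keep labels only for the final assistant turn.

    Hypothetical helper: `tokenizer` is assumed to expose `apply_chat_template`
    as Hugging Face tokenizers do.
    """
    if not messages or messages[-1]["role"] != "assistant":
        raise ValueError("expected the conversation to end with an assistant turn")

    # Prompt: every turn before the last, plus the header that opens the reply.
    # return_dict=False asks for a plain list of token ids.
    prompt_ids = list(
        tokenizer.apply_chat_template(
            messages[:-1],
            tokenize=True,
            add_generation_prompt=True,
            return_dict=False,
        )
    )
    # Full conversation, including the final assistant turn.
    full_ids = list(
        tokenizer.apply_chat_template(messages, tokenize=True, return_dict=False)
    )

    # Assumption: the prompt is an exact token prefix of the full conversation.
    if full_ids[: len(prompt_ids)] != prompt_ids:
        raise ValueError("templated prompt is not a prefix of the full conversation")

    # Mask the prompt so that only the final turn contributes to the loss.
    labels = [IGNORE_INDEX] * len(prompt_ids) + full_ids[len(prompt_ids):]
    return {
        "input_ids": full_ids,
        "attention_mask": [1] * len(full_ids),
        "labels": labels,
    }
```

With a real tokenizer this could be applied per example, for instance `trainset.map(lambda ex: mask_all_but_final_turn(tokenizer, ex["messages"]))`, assuming the dataset stores conversations under a `messages` column.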