generated from fastai/nbdev_template
Commit
🏆 Custom reward function for GRPO and shiny doc (#2606)
* initial commit
* doc on custom reward function
* test
* doc doc doc
* fix collator
* style
* links?
* I need a docdoc 🎵
* fix link
* I do like writing doc tbh
* it takes time, but it's worth it
* no return!
* type hint
* it's probably the best of both worlds [ci skip]
* new doc before implementation
* tests
* more doc
* style
* multiple pretrained funcs
* fix arg name
* main?
* example for R1
* fix script
* clearer
* import [ci skip]
* Update docs/source/grpo_trainer.md

Co-authored-by: lewtun <[email protected]>

---------

Co-authored-by: lewtun <[email protected]>
1 parent 887c1f3, commit a1d2955
Showing 4 changed files with 392 additions and 57 deletions.