GRPO - Do not load reference model when beta == 0 #2806

Open
ingambe wants to merge 7 commits into huggingface:main from ingambe:grpo-beta-ref-model