🥞 Fix DPO gradient accumulation loss scaling (#2615)
* fix DPO for gradient accumulation

* Update trl/trainer/dpo_trainer.py

* Update trl/trainer/dpo_trainer.py

* Update trl/trainer/dpo_trainer.py

---------

Co-authored-by: Quentin Gallouédec <[email protected]>
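The commit title refers to how the DPO loss is scaled when gradients are accumulated across microbatches. As a hedged illustration of the general problem this class of fix addresses (not the actual change made in trl/trainer/dpo_trainer.py), the sketch below shows why each microbatch's mean loss must be divided by the number of accumulation steps before `backward()`: without that scaling, the accumulated gradient is `accumulation_steps` times larger than the true full-batch gradient. All model, data, and variable names here are illustrative.

```python
# Minimal sketch of gradient-accumulation loss scaling, assuming equal-size
# microbatches and a mean-reduced loss. Illustrative only; not the trl code.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)
data = torch.randn(8, 4)
target = torch.randn(8, 1)
loss_fn = torch.nn.MSELoss()  # mean reduction, like most trainer losses

# Reference: one full-batch step.
model.zero_grad()
loss_fn(model(data), target).backward()
full_grad = model.weight.grad.clone()

# Gradient accumulation over 4 microbatches of 2 examples each.
accumulation_steps = 4
model.zero_grad()
for micro in range(accumulation_steps):
    chunk = slice(2 * micro, 2 * (micro + 1))
    loss = loss_fn(model(data[chunk]), target[chunk])
    # The scaling step: divide each microbatch's mean loss by the number of
    # accumulation steps so the summed gradients match the full-batch gradient.
    (loss / accumulation_steps).backward()

# The accumulated gradient now equals the full-batch gradient.
print(torch.allclose(model.weight.grad, full_grad))  # True
```

Dropping the `loss / accumulation_steps` division makes the two gradients differ by exactly a factor of `accumulation_steps`, which is the loss-scaling mismatch this kind of fix corrects.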