Actions: huggingface/trl

Build documentation

739 workflow runs

fix: typos in documentation files (#2804)
Build documentation #1120: Commit 55e680e pushed by qgallouedec
February 8, 2025 19:46 3m 34s main
⛰️ Reduce peak vram consumption with efficient selective log_softmax …
Build documentation #1119: Commit 09eefa7 pushed by qgallouedec
February 7, 2025 23:59 3m 28s main
➖ Fix GRPO example in README (#2800)
Build documentation #1118: Commit 7fdb69a pushed by qgallouedec
February 7, 2025 23:29 3m 45s main
🔬 SFT simplification (#2405)
Build documentation #1117: Commit 5b9236d pushed by qgallouedec
February 7, 2025 23:21 3m 28s main
📠 Log completions for GRPO (#2772)
Build documentation #1116: Commit 82d12eb pushed by qgallouedec
February 7, 2025 11:42 3m 31s main
🎯 [SFT] add token accuracy metric (#2597)
Build documentation #1115: Commit 84d73fd pushed by qgallouedec
February 7, 2025 10:09 3m 34s main
🆚 Distinguish padding and eos when they differ (#2793)
Build documentation #1114: Commit 2241f17 pushed by qgallouedec
February 7, 2025 10:08 3m 22s main
💡 Add 'Post training an LLM for reasoning with GRPO in TRL' tutorial …
Build documentation #1113: Commit 724acb9 pushed by qgallouedec
February 6, 2025 17:28 3m 53s main
Revert "Before the first training step, the model has no optimizer: f…
Build documentation #1112: Commit 7134a1e pushed by qgallouedec
February 6, 2025 17:20 3m 48s main
Before the first training step, the model has no optimizer: fix ds3
Build documentation #1111: Commit bf6e7ed pushed by qgallouedec
February 6, 2025 17:19 3m 53s main
🙃 Fix reward function in GRPO example (#2777)
Build documentation #1110: Commit e95f9fb pushed by qgallouedec
February 6, 2025 08:51 3m 33s main
💡 GRPO vram-efficiency improvement; only compute relevant logprobs (…
Build documentation #1109: Commit a85768f pushed by qgallouedec
February 6, 2025 07:52 3m 19s main
↔️ GRPO: Set max_model_len when initializing vLLM instance (#2728)
Build documentation #1108: Commit 78c5ce2 pushed by qgallouedec
February 5, 2025 23:12 3m 26s main
🚧 Add Optional ZeRO-3 Weight Gathering for GRPO in Sequence Generatio…
Build documentation #1107: Commit af4ad47 pushed by qgallouedec
February 4, 2025 22:24 3m 44s main
🔁 🦈 Support iterative GRPO (#2700)
Build documentation #1106: Commit b2ae999 pushed by qgallouedec
February 4, 2025 22:10 3m 28s main
🤖 Properly unwrap torch.compile-ed models in GRPO (#2750)
Build documentation #1105: Commit bd946f9 pushed by qgallouedec
February 4, 2025 21:22 3m 59s main
🔎 Add missing script argument in PPO documentation (#2720)
Build documentation #1104: Commit f42e34e pushed by qgallouedec
February 4, 2025 20:53 3m 59s main
📖 Clarification max len in Reward documentation (#2740)
Build documentation #1103: Commit 338fbd5 pushed by qgallouedec
February 4, 2025 20:16 4m 17s main
📐 Add vLLM dtype configuration for GRPO trainer (#2738)
Build documentation #1102: Commit 32f8fa8 pushed by qgallouedec
February 4, 2025 20:11 3m 56s main
📌 vLLM >= 0.7.1 for device fix (#2766)
Build documentation #1101: Commit 1a22764 pushed by qgallouedec
February 4, 2025 19:12 4m 9s main
💔 Decouple loss computing and generation in GRPO (#2762)
Build documentation #1100: Commit 1f344c9 pushed by qgallouedec
February 4, 2025 12:21 3m 33s main
🔂 Use vLLM prefix caching for speedup (#2757)
Build documentation #1099: Commit 85121fc pushed by qgallouedec
February 4, 2025 10:20 3m 44s main
⚠️ Fix attention masking in GRPO (#2708)
Build documentation #1098: Commit bbdd6db pushed by qgallouedec
February 2, 2025 19:44 3m 25s main
docs: Fix typos in alias descriptions (#2729)
Build documentation #1097: Commit 6e088d1 pushed by qgallouedec
February 2, 2025 10:59 3m 19s main
fix: Fix typo in filename in ultrafeedback-prompt.py (#2716)
Build documentation #1096: Commit a325a0e pushed by qgallouedec
February 1, 2025 13:53 3m 20s main