Actions: huggingface/trl

All workflows

Showing runs from all workflows
20,347 workflow runs

Improve GRPO example
Secret Leaks #2304: Commit 2fdf2c0 pushed by lewtun
January 31, 2025 09:41 21s lewtun-patch-1
⚠️ Fix Attention Masking in GRPO
Build PR Documentation #6326: Pull request #2708 synchronize by kashif
January 31, 2025 09:41 Action required andyl98:fix-grpo-logits-calc
⚠️ Fix Attention Masking in GRPO
Tests #7174: Pull request #2708 synchronize by kashif
January 31, 2025 09:41 Action required andyl98:fix-grpo-logits-calc
Upload PR Documentation
Upload PR Documentation #4626: completed by qgallouedec
January 31, 2025 09:33 31s
📖 Add GRPOTrainer to README.md (#2713)
Build documentation #1091: Commit 265663a pushed by qgallouedec
January 31, 2025 09:30 3m 37s main
📖 Add GRPOTrainer to README.md (#2713)
Secret Leaks #2303: Commit 265663a pushed by qgallouedec
January 31, 2025 09:30 15s main
📖 Add GRPOTrainer to README.md (#2713)
Tests #7172: Commit 265663a pushed by qgallouedec
January 31, 2025 09:30 24m 22s main
pages build and deployment
pages-build-deployment #1101: by qgallouedec
January 31, 2025 09:30 38s main
📖 Add GRPOTrainer to README.md
Build PR Documentation #6324: Pull request #2713 synchronize by qgallouedec
January 31, 2025 09:30 3m 2s burtenshaw:patch-1
GRPO for RL on agent trajectories
Hugging Face Issue Labeler #73: Issue #2715 opened by korbinian-hoermann
January 31, 2025 09:09 51s
Isn't the reward *minimized* when len(completion)==20 if this is the reward function?
Hugging Face Issue Labeler #72: Issue #2714 opened by cfpark00
January 31, 2025 09:03 22s
fix: Fix typo in filename Update ultrafeedback.py (#2699)
Secret Leaks #2302: Commit 5ab15d3 pushed by qgallouedec
January 31, 2025 09:01 20s main
fix: Fix typo in filename Update ultrafeedback.py (#2699)
Build documentation #1090: Commit 5ab15d3 pushed by qgallouedec
January 31, 2025 09:01 3m 27s main
fix: Fix typo in filename Update ultrafeedback.py (#2699)
Tests #7171: Commit 5ab15d3 pushed by qgallouedec
January 31, 2025 09:01 30m 58s main
fix: Fix typo in filename Update ultrafeedback.py (#2699)
Slow tests (on push) #479: Commit 5ab15d3 pushed by qgallouedec
January 31, 2025 09:01 21m 8s main
pages build and deployment
pages-build-deployment #1100: by qgallouedec
January 31, 2025 09:01 45s main
Upload PR Documentation
Upload PR Documentation #4625: completed by burtenshaw
January 31, 2025 08:57 34s
📖 Add GRPOTrainer to README.md
Build PR Documentation #6323: Pull request #2713 opened by burtenshaw
January 31, 2025 08:54 3m 30s burtenshaw:patch-1
Upload PR Documentation
Upload PR Documentation #4624: completed by brawncode
January 31, 2025 08:32 29s
GRPO with tool calling
Hugging Face Issue Labeler #71: Issue #2712 opened by accupham
January 31, 2025 07:25 26s
🔧 Optimize GRPO VRAM Usage by Computing Prompt Tokens Just Once
Build PR Documentation #6322: Pull request #2669 synchronize by andyl98
January 31, 2025 04:59 Action required andyl98:grpo-vram-optimization
🔧 Optimize GRPO VRAM Usage by Computing Prompt Tokens Just Once
Tests #7170: Pull request #2669 synchronize by andyl98
January 31, 2025 04:59 Action required andyl98:grpo-vram-optimization
LoRA 'trainable params: 0'
Hugging Face Issue Labeler #70: Issue #2711 opened by shannonruxin
January 31, 2025 04:50 28s
Examples in training VDPO on llava1.6
Hugging Face Issue Labeler #69: Issue #2710 opened by lucasjinreal
January 31, 2025 04:22 42s
GRPO memory bottleneck from num_generations in compute_loss
Hugging Face Issue Labeler #68: Issue #2709 opened by willccbb
January 31, 2025 03:54 40s