Issues: volcengine/verl
Basic Tutorial: Adding a New LLM Inference/Serving Backend
#21
opened Nov 22, 2024 by
PeterSH6
Open
Issues list
kl is applied twice if kl_loss_coef >0 and kl_coef for GRPO
#265
opened Feb 13, 2025 by
vermouth1992
How long will it be before support for DeepSeek R1 is available?
#253
opened Feb 12, 2025 by
zhangyipin
Does Verl Support Running PPO or GRPO Algorithms on the QWEN2.5 72B Model?
#249
opened Feb 11, 2025 by
none0663
CUDA Error Persists in Qwen GRPO Training Despite Setting VLLM_ATTENTION_BACKEND=XFORMERS
#246
opened Feb 11, 2025 by
AIBionics
Support Generative Reward Model (GenRM)
enhancement
New feature or request
good first issue
Good for newcomers
#229
opened Feb 9, 2025 by
maksimstw
Support Training with Both Function-Based Reward and DPO Reward Simultaneously
#214
opened Feb 6, 2025 by
lianghsun
Question about kl_penalty
question
Further information is requested
#211
opened Feb 6, 2025 by
StarDewXXX
Can we use VeRL to train the reward models
enhancement
New feature or request
#197
opened Feb 4, 2025 by
YSLIU627
If rollout.n is doubled, will the samples used for training be doubled too?
question
Further information is requested
#180
opened Feb 1, 2025 by
StarDewXXX
[Question] Is vLLMRollout.generate_sequences the right place to implement tool calling?
enhancement
New feature or request
question
Further information is requested
vllm related
#176
opened Jan 31, 2025 by
accupham
Multi-modal RL training support?
enhancement
New feature or request
#168
opened Jan 30, 2025 by
lucasjinreal
Add assertion to ensure the reward in GRPO is generated by ORM
#160
opened Jan 29, 2025 by
vermouth1992