volcengine / verl Public

Fork 255
Star 3k

Issues: volcengine/verl

[Roadmap] veRL Development Roadmap

#22 opened Nov 22, 2024 by PeterSH6

Open 2

[RFC] Megatron-LM and MCore maintaining issues for veRL

#15 opened Nov 19, 2024 by PeterSH6

Open

Basic Tutorial: Adding a New LLM Inference/Serving Backend

#21 opened Nov 22, 2024 by PeterSH6

Open 1

40 Open 53 Closed

Issues list

kl is applied twice if kl_loss_coef >0 and kl_coef for GRPO

#265 opened Feb 13, 2025 by vermouth1992

FSDP model Parallel

#263 opened Feb 13, 2025 by AIRobotZhang

How long will it be before support for DeepSeek R1 is available?

#253 opened Feb 12, 2025 by zhangyipin

Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in Qwen2ForCausalLM is torch.float32.

#252 opened Feb 12, 2025 by OKC13

When will verl support deepseekv3

#251 opened Feb 12, 2025 by jameswu2014

Load checkpoint from default_local_dir & Save hdfs checkpoints

#250 opened Feb 12, 2025 by cliangyu

Does Verl Support for Running PPO or GRPO Algorithms on QWEN2.5 72B Model

#249 opened Feb 11, 2025 by none0663

CUDA Error Persists in Qwen GRPO Training Despite Setting VLLM_ATTENTION_BACKEND=XFORMERS

#246 opened Feb 11, 2025 by AIBionics

Having issues with vLLM for GRPO

#245 opened Feb 11, 2025 by 3rdAT

Running the GRPO program on multiple nodes causes it to hang.

#242 opened Feb 10, 2025 by zhilizju

CUDA error: an illegal memory access was encountered while increasing the maximum prompt length (max_prompt_length)

#241 opened Feb 10, 2025 by AIBionics

Support Generative Reward Model (GenRM)

Labels: enhancement, good first issue

#229 opened Feb 9, 2025 by maksimstw

Will veRL support deepspeed?

#221 opened Feb 7, 2025 by albertcity

0.5B model OOM when Initializing actor_rollout

#217 opened Feb 6, 2025 by MiaoLu3

Support Training with Both Function-Based Reward and DPO Reward Simultaneously

#214 opened Feb 6, 2025 by lianghsun

Question about kl_penalty

Labels: question

#211 opened Feb 6, 2025 by StarDewXXX

Can we use VeRL to train the reward models

Labels: enhancement

#197 opened Feb 4, 2025 by YSLIU627

Integrate Verl with 🤗 hub

#190 opened Feb 3, 2025 by NielsRogge

if rollout.n is doubled, will the samples used for training doubled too?

Labels: question

#180 opened Feb 1, 2025 by StarDewXXX

[Question] Is vLLMRollout.generate_sequences the right place to implement tool calling?

Labels: enhancement, question, vllm related

#176 opened Jan 31, 2025 by accupham

Add a format issue template

#172 opened Jan 30, 2025 by vermouth1992

Add nightly ci to ensure accuracy

#171 opened Jan 30, 2025 by vermouth1992

Mulit-modal rl training support?

Labels: enhancement

#168 opened Jan 30, 2025 by lucasjinreal

Add assertion to ensure the reward in GRPO is generated by ORM

#160 opened Jan 29, 2025 by vermouth1992

Any support for peft methods on PPO?

#159 opened Jan 29, 2025 by HaochenZhao
