PPO Algorithm Convergence Issue: Ladder Degradation Problem #17

heping103 · 2024-09-12T02:29:35Z

I built a custom environment and trained on it with the PPO algorithm. Why does my policy suddenly collapse as training progresses?
Is this a problem with my environment, or is it a common problem in reinforcement learning? How can I fix it? Thank you for your help; I look forward to your response.

ericyangyu · 2024-10-01T02:30:02Z

Hi, thanks for reaching out. Off the top of my head, there could be several reasons for this, but I can't say much without knowing more about the task you want to train on. I know it's been a few weeks since you posted this, but if you still have questions, feel free to email me!
