Update train_agent.md #1237

TangLongbin · 2024-10-31T19:27:36Z

Description

In train_agent.md, the BlackjackAgent class requires env for initialization, but the demo code initializes it without env, which raises an error. I added the missing parameter and used 'agent.env' instead of 'env' in the training loop.
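
To illustrate the first fix, here is a minimal sketch of an agent that takes the environment as a constructor argument and is then used through `agent.env`. It is not the code from train_agent.md: `StubEnv`, `n_actions`, `sample_action`, and the reduced parameter list are hypothetical stand-ins chosen so the example runs without Gymnasium installed.

```python
import random
from collections import defaultdict


class StubEnv:
    """Hypothetical two-action stand-in for a Gymnasium environment."""

    n_actions = 2

    def sample_action(self) -> int:
        # Pick a random action, as an environment's action space would.
        return random.randrange(self.n_actions)


class BlackjackAgent:
    def __init__(self, env, learning_rate: float, epsilon: float):
        # The environment is stored on the agent, so training code can
        # refer to agent.env instead of a separate global variable.
        self.env = env
        self.lr = learning_rate
        self.epsilon = epsilon
        # One list of action values per observation, created on demand.
        self.q_values = defaultdict(lambda: [0.0] * env.n_actions)

    def get_action(self, obs) -> int:
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if random.random() < self.epsilon:
            return self.env.sample_action()
        values = self.q_values[obs]
        return values.index(max(values))


env = StubEnv()
# Passing env explicitly; omitting it raises a TypeError.
agent = BlackjackAgent(env=env, learning_rate=0.01, epsilon=0.1)
action = agent.get_action((12, 5, False))
```

Omitting `env` in the constructor call fails immediately with a `TypeError` for the missing argument, which is the kind of error the demo code ran into.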

The document has no code for post-processing and visualizing the training log (reward, episode length, and training error). The figure in the doc shows averaged values, whereas gym.wrappers.RecordEpisodeStatistics records only per-episode values, so plotting the raw log does not reproduce the figure. I added the average-value calculation (for later visualization) and matplotlib plotting code so that readers can obtain the expected result.
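
The averaging step can be sketched as a moving average over the per-episode values. This is an assumption about the kind of smoothing meant, not the code added in the PR; the `moving_average` helper, the window size, and the sample returns are all hypothetical.

```python
def moving_average(values, window):
    """Return the mean of each full run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that just left the window.
            running_sum -= values[i - window]
        if i >= window - 1:
            averages.append(running_sum / window)
    return averages


# Hypothetical per-episode returns, standing in for the recorded log.
episode_returns = [1.0, -1.0, 1.0, 1.0, -1.0, 0.0]
smoothed = moving_average(episode_returns, window=3)
```

The resulting list has `len(values) - window + 1` entries and can be passed to matplotlib's `plt.plot` in place of the raw per-episode values. The same helper applies to episode lengths and training errors.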

Type of change

Please delete options that are not relevant.

Documentation only change (no code changed)

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist:

I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Fixed a bug (agent initialized without `env`) and added the average value calculation (for later visualization).

pseudo-rnd-thoughts · 2024-11-01T16:44:07Z

@TangLongbin Could you summarise what you are fixing or solving in this PR?

TangLongbin · 2024-11-02T06:41:42Z

> @TangLongbin Could you summarise what you are fixing or solving in this PR?

Sure! I remember adding a description when I opened the pull request, but it is gone for some reason. Sorry about that! I'll explain the changes here:

In train_agent.md, the BlackjackAgent class requires env for initialization, but the demo code initializes it without env, which raises an error. I added the missing parameter and used 'agent.env' instead of 'env' in the training loop.
The document has no code for post-processing and visualizing the training log (reward, episode length, and training error). The figure in the doc shows averaged values, whereas gym.wrappers.RecordEpisodeStatistics records only per-episode values, so plotting the raw log does not reproduce the figure. I added the average-value calculation (for later visualization) and matplotlib plotting code so that readers can obtain the expected result.

Update train_agent.md

1e2b215

Fixed a bug (agent initialized without `env`) and added the average value calculation (for later visualization).
