This repository has been archived by the owner on Feb 23, 2023. It is now read-only.

Why is the performance of BO so odd in the cartpole problem? #70

Open
yxchng opened this issue Feb 2, 2017 · 2 comments
Comments

@yxchng

yxchng commented Feb 2, 2017

[Screenshot: learning performance of BO on cartpole (screen shot 2017-02-02 at 2 22 43 pm)]

Attached is the learning performance of BO on cartpole. From what I understand of BO, the performance should not be this erratic: after it "learnt" at around time step 38, it suddenly lost what it had learnt at time steps >150.

Is my understanding of BO wrong or is there something wrong with the implementation?
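
One thing that may help when reading a plot like this is to separate two curves: the return at each evaluation, which need not be monotone because BO keeps trying new parameter settings, and the best return found so far, which can never decrease. The sketch below computes the second from the first. The `returns` values are made up for illustration and are not taken from the screenshot.

```python
import numpy as np

# Made-up per-evaluation returns, for illustration only (not data from the plot).
returns = np.array([12.0, 30.0, 200.0, 45.0, 18.0, 200.0, 22.0])

# Running maximum: the best return observed up to and including each evaluation.
best_so_far = np.maximum.accumulate(returns)

print("per-evaluation:", returns)
print("best so far:   ", best_so_far)
```

If the attached curve is the per-evaluation return, dips after a good value do not by themselves mean the optimizer "forgot" anything; the best-so-far curve is the one that should be flat or rising.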

@javiergonzalezh
Member

Hi,

I don't think I have enough info to help you on this one. Is this something you are trying with GPyOpt? Can you elaborate a bit more?

Javier

@yxchng
Author

yxchng commented Feb 3, 2017

Basically, I am trying to apply BO to this problem:
https://gym.openai.com/envs/CartPole-v1

In other words, I am trying to find the parameters of a parameterized policy for the cartpole problem.
Attached above is the performance of BO.
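
To make the setup concrete, here is a minimal sketch of the kind of black-box objective this describes. It is an assumption-laden illustration, not the code behind the plot: the policy is assumed to be linear in the four state variables, the cart-pole dynamics are re-implemented in NumPy with the usual benchmark constants instead of calling the Gym environment, and the names `step`, `episode_return`, and `objective` are hypothetical.

```python
import math
import numpy as np

# Usual cart-pole benchmark constants (assumed; the Gym environment is not used).
GRAVITY = 9.8
MASS_CART = 1.0
MASS_POLE = 0.1
TOTAL_MASS = MASS_CART + MASS_POLE
HALF_LENGTH = 0.5                      # half the pole length
POLE_MASS_LENGTH = MASS_POLE * HALF_LENGTH
FORCE_MAG = 10.0
TAU = 0.02                             # seconds per simulation step
THETA_LIMIT = 12 * math.pi / 180       # pole angle at which the episode ends
X_LIMIT = 2.4                          # cart position at which the episode ends


def step(state, action):
    """Advance the cart-pole by one Euler step; action is 0 (left) or 1 (right)."""
    x, x_dot, theta, theta_dot = state
    force = FORCE_MAG if action == 1 else -FORCE_MAG
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    temp = (force + POLE_MASS_LENGTH * theta_dot ** 2 * sin_t) / TOTAL_MASS
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_LENGTH * (4.0 / 3.0 - MASS_POLE * cos_t ** 2 / TOTAL_MASS)
    )
    x_acc = temp - POLE_MASS_LENGTH * theta_acc * cos_t / TOTAL_MASS
    return np.array([
        x + TAU * x_dot,
        x_dot + TAU * x_acc,
        theta + TAU * theta_dot,
        theta_dot + TAU * theta_acc,
    ])


def episode_return(weights, seed=0, max_steps=500):
    """Run one episode with a linear policy; return the number of steps survived."""
    rng = np.random.default_rng(seed)
    state = rng.uniform(-0.05, 0.05, size=4)   # small random initial state
    weights = np.asarray(weights, dtype=float)
    for t in range(max_steps):
        # Hypothetical linear policy: push right when the weighted state is positive.
        action = 1 if float(weights @ state) > 0 else 0
        state = step(state, action)
        if abs(state[0]) > X_LIMIT or abs(state[2]) > THETA_LIMIT:
            return t + 1
    return max_steps


def objective(weights, seed=0):
    """Negated return, for optimizers that minimize."""
    return -float(episode_return(weights, seed=seed))
```

Calling `episode_return(w, seed=s)` with the same `w` and different `s` can give different values, because the initial state is random. An optimizer that sees only one episode per parameter setting is therefore working from a noisy objective.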

The performance seems quite erratic to me, and I am not sure BO should behave this way.

As I understand BO, it should not, but I may be misunderstanding it and the behavior above may actually be normal.

Thanks a lot for the help.
