Why is the performance of BO so odd in the cartpole problem? #70

yxchng · 2017-02-02T06:26:06Z

Attached is the learning performance of BO on cartpole. As I understand BO, the performance should not be this erratic: after it appears to have "learnt" at time step 38, it suddenly loses what it had learnt at around time steps >150.

Is my understanding of BO wrong or is there something wrong with the implementation?

javiergonzalezh · 2017-02-02T21:17:18Z

Hi,

I don't think I have enough info to help you on this one. Is this something you are trying with GPyOpt? Can you elaborate a bit more?

Javier

yxchng · 2017-02-03T03:04:33Z

Basically, I am trying to apply BO to this problem:
https://gym.openai.com/envs/CartPole-v1

In other words, I am trying to find the parameters of a parameterized policy for the cartpole problem.
Attached above is the performance of BO.
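
For concreteness, here is a minimal sketch of what such an objective could look like. Everything in it is an assumption, not the code actually used: the linear threshold policy, the names `episode_return` and `objective`, the classic Gym interface (`reset()` returns an observation, `step()` returns `(obs, reward, done, info)`), and the sign flip for an optimizer that minimizes.

```python
import numpy as np


def episode_return(env, theta, max_steps=500):
    """Total reward of one episode under a linear threshold policy.

    Hypothetical policy: push right (action 1) when theta . obs > 0,
    otherwise push left (action 0).
    """
    theta = np.asarray(theta, dtype=float)
    obs = env.reset()  # assumes the classic Gym API: reset() -> obs
    total = 0.0
    for _ in range(max_steps):
        action = int(np.dot(theta, obs) > 0.0)
        obs, reward, done, _info = env.step(action)  # classic 4-tuple
        total += reward
        if done:
            break
    return total


def objective(env, theta, n_episodes=5):
    """Negative mean return over a few episodes.

    Negated on the assumption that the optimizer minimizes; averaging
    several episodes reduces, but does not remove, rollout noise.
    """
    returns = [episode_return(env, theta) for _ in range(n_episodes)]
    return -float(np.mean(returns))
```

If the environment's start state is random, a single call to `episode_return` is a noisy estimate of how good `theta` is, which is why the sketch averages several episodes.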

The performance seems quite erratic to me, and I am not sure BO should behave this way.

My understanding of BO says it should not, but I may be misunderstanding it, in which case the behavior above would actually be normal.

Thanks a lot for your help.
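
One thing worth checking when reading a plot like the one described is whether it shows the return of each evaluated point or the best return found so far. BO deliberately evaluates exploratory points, so the per-evaluation curve can drop after a good point has been found, while the best-so-far curve never decreases. The sketch below illustrates the difference. The numbers are made up for illustration and are not taken from the attached plot, and `best_so_far` is a hypothetical helper name.

```python
import numpy as np


def best_so_far(returns):
    """Running maximum of the returns observed at each evaluation."""
    return np.maximum.accumulate(np.asarray(returns, dtype=float))


# Made-up per-evaluation returns, for illustration only.
returns = [12.0, 9.0, 200.0, 35.0, 180.0, 20.0]

# The raw sequence fluctuates; the running maximum is non-decreasing.
print("per evaluation:", returns)
print("best so far:   ", best_so_far(returns).tolist())
```

If the best-so-far curve is also erratic, that would point to something else, such as noisy rollouts being compared across evaluations.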
