Fails to converge on bandit tasks #57

Open
vzhuang opened this issue Feb 4, 2021 · 1 comment

vzhuang commented Feb 4, 2021

With k=5 and n=100, MAML fails to learn: the average training and validation returns hover around 50 throughout all 500 outer-loop steps. Are there any discrepancies between this repo's code/config and the paper's experiments?

For reference, the following command

python train.py --config configs/maml/bandit/bandit-k5-n100.yaml --output-folder maml-bandit-k5-n100 --seed 1 --num-workers 10

produces the following average training and validation returns for the first and last 5 iterations (columns: iteration, training return, validation return):

0 49.1 51.600002
1 45.5 47.75
2 49.449997 50.350002
3 49.65 52.2
4 50.4 52.7
...
495 46.6 50.0
496 50.150005 50.200005
497 53.100002 55.15
498 49.0 50.450005
499 44.5 47.600002
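
For context, returns near 50 are roughly what a policy that ignores rewards would collect. The sketch below estimates that baseline by simulation. It rests on assumptions not stated in this thread: that the task is a Bernoulli bandit whose k arm probabilities are drawn uniformly from [0, 1], and that n is the number of pulls per episode. The function name `random_policy_return` and its parameters are illustrative only and are not part of this repo.

```python
import numpy as np

def random_policy_return(k=5, n=100, num_tasks=10_000, seed=0):
    """Average return of a uniformly random policy over sampled bandit tasks."""
    rng = np.random.default_rng(seed)
    # ASSUMPTION: each arm pays reward 1 with a probability drawn from U[0, 1].
    arm_means = rng.uniform(0.0, 1.0, size=(num_tasks, k))
    # Pull a uniformly random arm at each of the n steps of every task.
    pulls = rng.integers(0, k, size=(num_tasks, n))
    # Look up the success probability of the arm pulled at each step.
    probs = np.take_along_axis(arm_means, pulls, axis=1)
    # Sample Bernoulli rewards and sum them per task to get episode returns.
    rewards = rng.random((num_tasks, n)) < probs
    return rewards.sum(axis=1).mean()

baseline = random_policy_return()
print(f"random-policy average return: {baseline:.1f}")
```

Under those assumptions the estimate should land close to 50, which would mean the logged returns are indistinguishable from random arm selection.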

tristandeleu (Owner) commented

The code changed significantly between the version we used in the paper and the current version. The paper was written on a very early version of the code, which probably got lost in the many refactorings we did even before open-sourcing it. Unfortunately, I haven't run the bandit experiments on the new code since then: I added the config files a few months ago after some requests, but I haven't tried them myself. I don't know whether the results still hold with this version (I thought they would).
