
Config file #153

Conversation

internetcoffeephone
Contributor

Implemented the config file as mentioned in #151.

4 questions:

  • All run scripts work except train_baseline_actions_dqn; I presume you're working on a private/unpushed ray branch?
  • Currently wondering: should we get rid of tf.app.flags and retrieve values directly from the dictionary created by config_parser?
  • tf.app.flags adds the benefit of command-line arguments; are you currently using those?
  • Some parameters have not been implemented or are commented out, specifically the redis/memory/debug parameters. Either the debug flag is no longer needed, or would you prefer debug-specific values for these parameters?
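On the second and third questions, one way to keep command-line overrides while dropping tf.app.flags is to layer argparse on top of the config dictionary. A minimal sketch, using the stdlib configparser as a stand-in for the project's config_parser module (`load_params` and the `experiment` section name are hypothetical):

```python
import argparse
import configparser

def load_params(config_path, cli_args=None):
    """Read parameters from a config file, then apply optional
    command-line overrides (every config key becomes a --key flag)."""
    config = configparser.ConfigParser()
    config.read(config_path)
    params = dict(config["experiment"])  # e.g. {"num_steps": "3e8", ...}

    parser = argparse.ArgumentParser()
    for key in params:
        parser.add_argument(f"--{key}", default=None)
    overrides = parser.parse_args(cli_args)

    # Only keys explicitly passed on the command line replace config values.
    for key, value in vars(overrides).items():
        if value is not None:
            params[key] = value
    return params
```

This keeps the config file as the single source of defaults while preserving the one thing tf.app.flags provided: ad-hoc overrides for individual runs.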

I haven't reproduced the results from the paper yet. Currently it takes me 6 days to run the 3e8 steps required per experiment; I'm in the process of requesting more powerful hardware.

internetcoffeephone and others added 11 commits May 8, 2019 22:04
Added localconfig requirement.
Many user-specific and experiment-specific parameters have been moved to the config file.
Renamed train scripts to have consistent names.
Now follows the convention: train_[experiment]_[algorithm].py
Experiment results are written to folders with the following naming convention:
[experiment]_[algorithm]_[environment], where environment is either cleanup or harvest.
Removed train_baseline as it is redundant with train_baseline_a3c.
Removed exp_name check, as this is now always handled by config_parser.
Simplified access to hyperparameters.
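The result-folder convention from the commits above ([experiment]_[algorithm]_[environment]) could be centralized in a small helper so every train script names its output the same way. A sketch (the helper name is hypothetical):

```python
def results_dir(experiment, algorithm, environment):
    """Build the output folder name [experiment]_[algorithm]_[environment],
    where environment is either cleanup or harvest."""
    if environment not in ("cleanup", "harvest"):
        raise ValueError(f"Unknown environment: {environment}")
    return f"{experiment}_{algorithm}_{environment}"
```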
@internetcoffeephone
Contributor Author

Additionally, I imagine some train_* files can be merged when parametrized; there's a lot of duplicated code in there. Separating the experiment categories (baseline, visible actions, influence, moa) from their algorithms (A3C, A2C, DQN) so they can vary independently would be ideal, although I'm not sure whether it's easy to do.
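One way to sketch that separation is a pair of lookup tables, so experiments and algorithms vary independently and each (experiment, algorithm) pair needs no dedicated script. All names here (setup_baseline, setup_moa, build_trainer) are hypothetical:

```python
# Hypothetical per-experiment setup functions: each returns the
# experiment-specific config, independent of the training algorithm.
def setup_baseline(params):
    return {"model": "baseline", **params}

def setup_moa(params):
    return {"model": "moa", **params}

EXPERIMENTS = {"baseline": setup_baseline, "moa": setup_moa}
ALGORITHMS = {"a3c": "A3C", "a2c": "A2C", "dqn": "DQN"}

def build_trainer(experiment, algorithm, params):
    """Combine any experiment with any algorithm; unsupported
    combinations fail loudly instead of silently misconfiguring."""
    if experiment not in EXPERIMENTS or algorithm not in ALGORITHMS:
        raise ValueError(f"Unknown combination: {experiment}/{algorithm}")
    config = EXPERIMENTS[experiment](params)
    config["run"] = ALGORITHMS[algorithm]
    return config
```

A single train.py could then read the experiment and algorithm from the config file and dispatch through `build_trainer`, replacing the per-combination scripts.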

… file.

Curriculum is not used yet, as it does not cleanly map to a single parameter.
Made some numbers more clear (100e6 -> 1e8)
Renamed train_influence_a3c to train_influence_moa.
Renamed train_moa_a3c to train_moa_baseline.
@internetcoffeephone
Contributor Author

These changes are all very outdated, and I'm getting rid of config_parser in my fork. Thus, closing.
