A tiny confusion about the 'state' concept #51
Question: I'm trying to fit my own environment into the gym interface to use the algorithms; however, the step() function needs to return a 'state'. Is this the global state that contains all the information in the game, or the shared observation that is visible to all agents? If it is the former, should I also put the shared observation into each agent's own returned observation?

Reply: Hi. The 'state' is the global state containing all information in the game, which makes the problem a valid MDP. The observation of each agent should follow your environment design. Hope it helps.

Follow-up: Thank you for your reply! Sorry to bother you again: what is the most convenient way to modify the algorithms (HASAC, for example) for offline training? Any suggestions?

Reply: These algorithms are not inherently designed for offline settings, so they do not curate large offline datasets or impose conservative training constraints. You will likely need to design new algorithms and construct offline datasets to achieve offline training.
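The distinction discussed above (a global state returned by step() alongside per-agent local observations) can be sketched with a toy environment. This is a hypothetical interface for illustration only; the class and method names are assumptions, not the actual HARL or gym API.

```python
# Hypothetical sketch: a multi-agent env whose step() returns both
# per-agent observations and a global state. Names are illustrative.
import numpy as np

class TwoAgentGridEnv:
    """Toy 2-agent env: each agent observes only its own position
    (local obs), while the global state contains both positions."""

    def __init__(self):
        self.n_agents = 2
        self.positions = np.zeros((self.n_agents, 2), dtype=np.float32)

    def reset(self):
        self.positions[:] = 0.0
        return self._obs(), self._state()

    def _obs(self):
        # Each agent's observation: its own position only.
        return [self.positions[i].copy() for i in range(self.n_agents)]

    def _state(self):
        # Global state: all information in the game, flattened.
        return self.positions.flatten().copy()

    def step(self, actions):
        # actions: one 2-D move per agent.
        for i, a in enumerate(actions):
            self.positions[i] += np.asarray(a, dtype=np.float32)
        # Toy reward: negative distance from the origin, per agent.
        rewards = [-float(np.linalg.norm(p)) for p in self.positions]
        dones = [False] * self.n_agents
        return self._obs(), self._state(), rewards, dones, {}
```

Under this design, the shared information lives only in the global state; whether each agent's observation also includes shared information is up to your environment design, as the reply notes.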