add pets environments and reward functions #752
base: pytorch
Conversation
@gin.configurable
def reward_function_for_pendulum(obs, action):
There is already a reward function for pendulum? It seems you are trying to organize all mbrl reward functions in a single file here.
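For reference, Gym's Pendulum-v0 reward can be sketched as below. This is a scalar, stdlib-only sketch; the PR's version presumably operates on batched torch tensors instead, but the formula is the same.

```python
import math

def reward_function_for_pendulum(obs, action):
    # obs = [cos(theta), sin(theta), theta_dot]; Gym's Pendulum reward is
    # -(theta^2 + 0.1 * theta_dot^2 + 0.001 * action^2), with theta
    # normalized to [-pi, pi] (atan2 returns that range directly).
    cos_th, sin_th, th_dot = obs
    th = math.atan2(sin_th, cos_th)
    return -(th ** 2 + 0.1 * th_dot ** 2 + 0.001 * action ** 2)
```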
@gin.configurable
def reward_function_for_halfcheetah(obs, action):
    """Function for computing reward for gym CartPole environment. It takes
CartPole -> halfcheetah
@gin.configurable
def reward_function_for_pusher(obs, action):
    """Function for computing reward for gym CartPole environment. It takes
CartPole -> Pusher
@gin.configurable
def reward_function_for_reacher(obs, action):
    """Function for computing reward for gym CartPole environment. It takes
CartPole -> Reacher
@gin.configurable
def reward_function_for_cartpole(obs, action):
About the reward functions: sometimes the association with the corresponding env/task is clear, such as pendulum, since that is a standard task from Gym. Sometimes it might be necessary to make the association more explicit. For example, the cartpole reward here is not for CartPole-v0 from Gym, which is also a cartpole task but with discrete actions. Similarly for others, such as the halfcheetah reward, etc.
new_rot_axis, new_rot_perp_axis, cur_end + length * new_rot_axis

cost = torch.sum(
    torch.square(cur_end - common.get_gym_env_attr('goal')), dim=-1)
Will this way of retrieving the goal information still be correct if we have multiple parallel environments? It seems get_gym_env_attr uses
gym_env = _env.envs[0].gym
in this case?
same concern
@@ -0,0 +1,95 @@
<!-- Cheetah Model

The state space is populated with joints in the order that they are
For the xml files, I'm not sure whether we should also check the gym/mujoco license, apart from the reference to PETS, if we include them.
Another possible way might be to provide pointers/scripts to download them?
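A download script along those lines could look like the sketch below. The URL is a placeholder, not a real asset location, and `fetch_missing_assets` is a hypothetical helper; the fetch callable is injectable so the real source or a mirror can be swapped in.

```python
import os
import urllib.request

# Placeholder URL only: the real asset location would have to be filled in.
XML_ASSETS = {
    'half_cheetah.xml': 'https://example.com/half_cheetah.xml',
}

def fetch_missing_assets(dest='assets', assets=XML_ASSETS,
                         fetch=urllib.request.urlretrieve):
    """Download any xml assets that are not already present locally."""
    os.makedirs(dest, exist_ok=True)
    fetched = []
    for name, url in assets.items():
        path = os.path.join(dest, name)
        if not os.path.exists(path):
            fetch(url, path)  # injectable, e.g. for testing or mirrors
            fetched.append(name)
    return fetched
```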
from __future__ import division
from __future__ import print_function
from __future__ import absolute_import
The __future__ imports here and in other files can be removed.
new_rot_axis, new_rot_perp_axis, cur_end + length * new_rot_axis

cost = torch.sum(
    torch.square(cur_end - common.get_gym_env_attr('goal')), dim=-1)
same concern
from gym.envs.mujoco import mujoco_env


class CartpoleEnv(mujoco_env.MujocoEnv, utils.EzPickle):
need descriptions for all these new environments.
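A hypothetical sketch of the kind of class docstring each new environment could carry; the physical details below are illustrative placeholders, not the real specs of the PR's environment.

```python
class CartpoleEnv:  # the real class subclasses mujoco_env.MujocoEnv, utils.EzPickle
    """Continuous-action cartpole based on the PETS MuJoCo model.

    A pole is hinged to a cart on a rail; the agent applies a continuous
    force to the cart to swing the pole up and keep it balanced. Unlike
    Gym's CartPole-v0, the action space is continuous.
    """
```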