[Question] Maze Dense Reward #175
Question
Looking at the dense reward function for the maze environments:

return np.exp(-np.linalg.norm(desired_goal - achieved_goal))

After optimisation, the agent seems to prefer parking the ball as close as possible to the goal without actually touching it. This makes sense, given that there is no bonus for reaching the goal and the reward is positive at every time step.

Why is the dense reward formulated this way?
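To make the incentive concrete, here is a minimal standalone sketch (plain NumPy, independent of the actual environment code; the dense_reward helper and the sample distances are just for illustration) that evaluates this reward at a few distances from the goal:

import numpy as np

def dense_reward(achieved_goal, desired_goal):
    # The dense reward quoted above: exp of the negative Euclidean
    # distance to the goal. Strictly positive, peaking at 1 on the goal.
    return np.exp(-np.linalg.norm(desired_goal - achieved_goal))

goal = np.zeros(2)
for d in (0.0, 0.1, 0.5, 1.0, 2.0):
    ball = np.array([d, 0.0])
    print(f"distance {d:.1f} -> reward {dense_reward(ball, goal):.3f}")
# distance 0.0 -> reward 1.000
# distance 0.1 -> reward 0.905
# distance 0.5 -> reward 0.607
# distance 1.0 -> reward 0.368
# distance 2.0 -> reward 0.135

Hovering at distance 0.1 already collects about 90% of the maximum per-step reward, so if reaching the goal ends the episode (or earns no extra bonus), an agent can accumulate more return by loitering next to the goal than by entering it, which matches the behaviour described above.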
Comments

Somewhat related: the description of the maze environments says
Gymnasium-Robotics/gymnasium_robotics/envs/maze/maze_v4.py Lines 374 to 381 in 8606192

You are correct, can you make a PR to fix it?