Replies: 4 comments
-
Indeed, we need to call setSimTime. Keep in mind that this PyBullet DeepMimic implementation is still preliminary and under construction, so it still requires a bit more work until it is fully functional. Help is welcome, though!
-
Okay, I can make a pull request regarding it. I believe I have made the changes in the necessary places. Also
-
Hi, @erwincoumans. I am also following the DeepMimic training with PyBullet. I directly used the PPO algorithm provided by OpenAI Baselines to train the PyBullet humanoid to mimic the reference walking motion, but the policy does not succeed in mimicking the reference motion.
-
I'm working on this on-and-off. All of it needs more work. |
-
Hi,
In the GetReward() function of the humanoid.py script, the reward is computed by comparing the current pose of the agent with the pose of _kinematicHumanoid, which is initialized from the motion data. Now, in the reset() function of humanoid_deepmimic_gym_env.py, a pose is randomly sampled from the motion file and the agent is initialized in that pose; the action is then performed and the dynamics are simulated for 8 steps (corresponding to querying the policy at 30 Hz), which brings the agent to a new state. However, the sim time is never updated, so in the reward computation the agent is always compared against the same reference pose. I believe setSimTime() should be called in the step function too. If this sounds right, I am happy to submit a pull request to rectify the issue.
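To illustrate the proposed fix, here is a minimal, self-contained sketch of the timing logic. The class and helper names below are illustrative stand-ins, not the actual PyBullet code; only `setSimTime`, the 8 substeps, and the 30 Hz policy rate come from the thread, and the 240 Hz physics step is an assumption.

```python
# Hypothetical sketch: advance the simulation clock inside step() so that
# the reward comparison uses the reference pose at the *current* time.
# `_set_sim_time` stands in for humanoid.setSimTime(t), which would update
# the _kinematicHumanoid pose from the motion data.

class HumanoidDeepMimicEnvSketch:
    def __init__(self, time_step=1.0 / 240.0, substeps=8):
        self._timeStep = time_step    # assumed 240 Hz physics step
        self._numSubsteps = substeps  # 8 substeps -> 30 Hz policy queries
        self._simTime = 0.0
        self.reference_time = 0.0

    def reset(self, start_time=0.0):
        # In the real env a pose is sampled from the motion file at
        # `start_time` (not shown here).
        self._simTime = start_time
        self._set_sim_time(self._simTime)

    def step(self):
        # Simulate the dynamics for the substeps, then advance the clock
        # before the reward is computed -- the call the thread says is missing.
        for _ in range(self._numSubsteps):
            self._simTime += self._timeStep
        self._set_sim_time(self._simTime)

    def _set_sim_time(self, t):
        # Placeholder for setSimTime(t) on the kinematic humanoid.
        self.reference_time = t
```

With this change, each 8-substep `step()` moves the reference pose forward by 8/240 s (one 30 Hz policy period), so GetReward() compares against a moving target rather than the initial pose.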