Hello, I am a little confused about this equation: self.future_p = 1 - (1. / (1 + replay_k))
I think replay_k means that we want to select k transitions in one episode (50 transitions) for computing HER goals, but how does future_p correspond to this? Can you give some interpretation? Thank you!
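For reference, here is a minimal sketch of one possible reading of the formula. It assumes, which the text above does not state, that each transition drawn for a training batch is relabeled with a HER goal independently with probability future_p. Under that assumption, future_p = k / (1 + k), so on average k relabeled transitions appear for every 1 transition that keeps its original goal, rather than exactly k transitions being picked per episode. The helper names `future_p_from_replay_k` and `count_relabeled` are hypothetical and exist only for this illustration.

```python
import random


def future_p_from_replay_k(replay_k):
    """Same arithmetic as the quoted line: 1 - 1/(1 + k), which equals k/(1 + k)."""
    return 1 - (1.0 / (1 + replay_k))


def count_relabeled(batch_size, replay_k, seed=0):
    """Hypothetical helper: count the samples that would be relabeled.

    Assumption: a sample is relabeled when a uniform draw in [0, 1)
    falls below future_p. This is an illustrative rule, not a confirmed
    description of the sampler being asked about.
    """
    rng = random.Random(seed)
    future_p = future_p_from_replay_k(replay_k)
    return sum(rng.random() < future_p for _ in range(batch_size))


if __name__ == "__main__":
    batch_size = 10_000
    for k in (1, 4, 8):
        p = future_p_from_replay_k(k)
        relabeled = count_relabeled(batch_size, k)
        # The ratio relabeled : kept should come out close to k : 1.
        print(f"replay_k={k} future_p={p:.3f} "
              f"relabeled={relabeled} kept={batch_size - relabeled}")
```

Read this way, replay_k sets a ratio between relabeled and original samples, and future_p is that ratio expressed as a per-sample probability.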