Hello! I have a question on the discrete SAC design.
What was the reasoning behind the choice of target entropy in discrete SAC? If I understand correctly, the target entropy represents the ideal entropy of the optimal policy. If so, why is it -0.98 * log(1 / |A|)?
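To make the expression concrete, here is a minimal sketch of what it evaluates to, assuming |A| is the number of discrete actions and log is the natural logarithm. The function names are hypothetical and are not taken from the repository. Since -log(1 / |A|) = log |A|, which is the entropy of a uniform policy over |A| actions, the expression is 98% of the maximum possible policy entropy.

```python
import math


def max_policy_entropy(num_actions: int) -> float:
    """Entropy (in nats) of a uniform distribution over num_actions actions."""
    if num_actions < 1:
        raise ValueError("num_actions must be a positive integer")
    return math.log(num_actions)


def discrete_target_entropy(num_actions: int, scale: float = 0.98) -> float:
    """Evaluate -scale * log(1 / |A|), the expression quoted in the question.

    Hypothetical helper for illustration; `scale` defaults to the 0.98
    factor from the question.
    """
    if num_actions < 1:
        raise ValueError("num_actions must be a positive integer")
    return -scale * math.log(1.0 / num_actions)


if __name__ == "__main__":
    for n in (2, 4, 18):
        target = discrete_target_entropy(n)
        upper = max_policy_entropy(n)
        # The target is always scale * log|A|, i.e. a fixed fraction of the maximum.
        print(f"|A|={n}: target={target:.4f}, max={upper:.4f}")
```

For example, with |A| = 4 the maximum entropy is log 4 ≈ 1.386 nats and the target is about 1.359 nats, so the target sits just below the entropy of a fully uniform policy.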