`options['gamma']` is simply one hyperparameter; you can set it to 1 or anything else. I don't remember whether `options['gamma'] = 1` is optimal. Maybe you can tune it to get better results.
Thanks for the reply. Do you still have the optimal values for the learning rate, momentum, and decay ratio? When I use the default learning rate of 0.05, the loss reaches 20 in the early iterations, which is not reasonable.
In `san_att_lstm_twolayer.py`, we can see that the learning rate is 0.05 initially. In the function `get_lr()`, the learning rate is supposed to decay over training. However, `options['gamma']` equals 1, which means the learning rate will not decay at all. This is the part I'm wondering about.
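For context, a common form for a gamma-based schedule like the one discussed above is exponential decay. This is a hedged sketch, not the repo's actual `get_lr()`; the function name, the `options` keys, and the exact decay formula are assumptions, but it illustrates why `gamma = 1` leaves the learning rate constant:

```python
def get_lr(options, epoch):
    """Hypothetical exponential decay: lr = lr_0 * gamma^epoch.

    With options['gamma'] == 1, gamma**epoch is always 1, so the
    learning rate never changes from its initial value.
    """
    return options['lr'] * (options['gamma'] ** epoch)

opts = {'lr': 0.05, 'gamma': 1.0}
print(get_lr(opts, 0))    # 0.05
print(get_lr(opts, 100))  # still 0.05 -- gamma == 1 means no decay

opts_decay = {'lr': 0.05, 'gamma': 0.5}
print(get_lr(opts_decay, 1))  # 0.025 -- halved after one epoch
```

Setting `gamma` slightly below 1 (e.g. 0.9 or 0.5, depending on how often the schedule is applied) would give an actual decay.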