
Questions about learning rate #4

Open
chingyaoc opened this issue Jun 30, 2016 · 2 comments

@chingyaoc

In san_att_lstm_twolayer.py, the learning rate is initially 0.05.
In the function get_lr(), the learning rate decays like this:

options['lr'] * (options['gamma'] ** power)

However, options['gamma'] equals 1, which means the learning rate will not decay:

options['gamma'] = 1

This is the part I'm wondering about.
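
For reference, here is a minimal standalone sketch of that decay rule (this get_lr is reconstructed from the formula quoted above, not copied from the repo): with gamma = 1 the schedule stays constant at the initial rate, while any gamma < 1 decays geometrically.

import numpy

def get_lr(options, power):
    # Exponential decay: lr * gamma^power.
    # With gamma == 1 the rate never changes.
    return options['lr'] * (options['gamma'] ** power)

options = {'lr': numpy.float32(0.05), 'gamma': 1}
print([get_lr(options, p) for p in range(3)])  # [0.05, 0.05, 0.05]

options['gamma'] = 0.5  # hypothetical value that actually decays
print([get_lr(options, p) for p in range(3)])  # [0.05, 0.025, 0.0125]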

@zcyang
Owner

zcyang commented Jun 30, 2016

options['gamma'] is simply one hyperparameter; you can set it to 1 or anything else. I don't remember whether options['gamma'] = 1 is optimal. Maybe you can tune it to make it better.

@chingyaoc
Author

Thanks for the reply. Do you still have the optimal values for the learning rate, momentum, and decay ratio? When I use the learning rate of 0.05, the loss becomes 20 in the early iterations, which is not reasonable.

options['lr'] = numpy.float32(0.05)
options['momentum'] = numpy.float32(0.9)
options['gamma'] = 1
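
For context, a generic momentum-SGD step using these hyperparameters could look like the sketch below (an assumption about the update rule; the repo's actual optimizer may differ). With momentum = 0.9, the effective step size for a persistent gradient direction can grow toward roughly lr / (1 - momentum) = 0.5, which might contribute to the large early loss.

import numpy

options = {}
options['lr'] = numpy.float32(0.05)
options['momentum'] = numpy.float32(0.9)
options['gamma'] = 1

def momentum_sgd_step(param, grad, velocity, options, power):
    # Classical momentum SGD; a generic sketch, not the repo's exact update.
    lr = options['lr'] * (options['gamma'] ** power)
    velocity = options['momentum'] * velocity - lr * grad
    return param + velocity, velocity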
