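Hello, I'm studying along with your notes and have a question about the DDQN section. When computing the target value Qtarget, the notes first pick the action a' that is optimal under the network θ (the one that generates the interaction behavior), then plug a' into the target value network θ' to compute the value. Elsewhere I have seen the optimal a' chosen directly from the target value network θ' and then used to compute the target value. What is the difference between these two approaches?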
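To make the two variants in the question concrete, here is a minimal NumPy sketch for a single transition. The function names, the example Q-values, and the `gamma`/`done` parameters are assumptions for illustration and are not taken from the notes. As far as I understand it, the first form (max taken directly over θ') is the original DQN target, and the second form (action selected with θ, evaluated with θ') is the Double DQN target, which separates action selection from action evaluation to reduce overestimation of Q-values.

```python
import numpy as np

def dqn_target(reward, q_next_target, gamma=0.99, done=False):
    """Target where theta' both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * float(np.max(q_next_target))

def double_dqn_target(reward, q_next_online, q_next_target, gamma=0.99, done=False):
    """Target where theta selects a' and theta' evaluates it."""
    if done:
        return reward
    a_star = int(np.argmax(q_next_online))  # a' chosen by the online network theta
    return reward + gamma * float(q_next_target[a_star])  # value read from theta'

# Hypothetical Q-values for the next state s' (three actions).
q_next_online = np.array([1.0, 2.0, 0.5])  # Q(s', . ; theta)
q_next_target = np.array([1.5, 1.2, 3.0])  # Q(s', . ; theta')

y_dqn = dqn_target(0.0, q_next_target, gamma=0.9)
y_ddqn = double_dqn_target(0.0, q_next_online, q_next_target, gamma=0.9)
print(y_dqn, y_ddqn)
```

In this made-up example the two targets differ because θ and θ' disagree about the best action: the first form always takes the largest value in θ', while the second form only uses the value θ' assigns to the action that θ prefers. When both networks agree on the argmax, the two targets are identical.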