-------------------------> Profiling Report <-------------------------

Place: All
Time unit: ms
Sorted by total time in descending order in the same thread

Event                              Calls  Total     CPU Time (Ratio)      GPU Time (Ratio)      Min.      Max.      Ave.       Ratio.
thread0::GpuMemcpySync:GPU->CPU    10     65.3952   39.898246 (0.610110)  25.496945 (0.389890)  6.46307   6.75515   6.53952    0.434411
thread0::fetch                     10     42.5865   37.686449 (0.884939)  4.900031 (0.115061)   4.15256   4.90003   4.25865    0.282896
thread0::TensorCopySync:GPU->CPU   10     41.8076   37.494616 (0.896837)  4.313012 (0.103163)   4.13309   4.31301   4.18076    0.277722
thread0::abs                       10     0.6688    0.450827 (0.674083)   0.217973 (0.325917)   0.052712  0.134069  0.06688    0.00444274
thread0::feed                      10     0.079468  0.064448 (0.810993)   0.015020 (0.189007)   0.005744  0.01502   0.0079468  0.000527895

{
  name: "abs",
  device: "GPU",
  precision: { stable: "True", diff: 0.00000 },
  speed: { repeat: 10, start: 1, end: 9, total: 5.08994, feed: 0.00000, compute: 0.00000, fetch: 0.00000 }
}
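The CPU->GPU transfer of the feed data starts when the feed data is set in the Executor; it does not happen inside the feed op.
The GPU->CPU transfer of the fetch data happens inside the fetch op. At the very bottom, after the GPU operation has finished, the cuda_api layer still takes a long time.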
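
To make the numbers in the report easier to compare, here is a minimal Python sketch that recomputes the per-event ratios from the values printed above. It uses no Paddle API. The row values are copied from the report with the `thread0::` prefix dropped; the names `ROWS`, `summarize`, and `SUMMARY` are hypothetical helpers introduced only for this illustration. It assumes that the last `Ratio.` column is each event's total time divided by the sum of all listed totals, which matches the printed figures but is not confirmed by the profiler's documentation here.

```python
# Rows copied from the profiling report above:
# (event, calls, total_ms, cpu_ms, gpu_ms)
ROWS = [
    ("GpuMemcpySync:GPU->CPU", 10, 65.3952, 39.898246, 25.496945),
    ("fetch", 10, 42.5865, 37.686449, 4.900031),
    ("TensorCopySync:GPU->CPU", 10, 41.8076, 37.494616, 4.313012),
    ("abs", 10, 0.6688, 0.450827, 0.217973),
    ("feed", 10, 0.079468, 0.064448, 0.015020),
]


def summarize(rows):
    """Recompute CPU/GPU ratios, average time, and share of the summed totals."""
    # Assumption: the report's last column is total / sum of all totals.
    grand_total = sum(total for _, _, total, _, _ in rows)
    summary = {}
    for event, calls, total, cpu, gpu in rows:
        summary[event] = {
            "cpu_ratio": cpu / total,      # share of the event spent on the CPU side
            "gpu_ratio": gpu / total,      # share of the event spent on the GPU side
            "avg_ms": total / calls,       # average time per call
            "share": total / grand_total,  # share of the summed totals
        }
    return summary


SUMMARY = summarize(ROWS)

if __name__ == "__main__":
    for event, stats in SUMMARY.items():
        print(
            f"{event:26s} cpu={stats['cpu_ratio']:.4f} "
            f"gpu={stats['gpu_ratio']:.4f} avg={stats['avg_ms']:.5f} ms "
            f"share={stats['share']:.4f}"
        )
```

Read this way, the three fetch-side events (`GpuMemcpySync:GPU->CPU`, `fetch`, `TensorCopySync:GPU->CPU`) account for almost all of the recorded time, and most of each is CPU-side time rather than GPU time, while the `abs` op itself is a fraction of a percent.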