[gpu/enhance] Shared memory implementation for Blas kernels #2784
Implementation of shared memory using proper flags. Added changes for Blas kernels. Signed-off-by: Debadri Samaddar <[email protected]>
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2784. Please follow the 1commit/1PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before reviewers start the review process. If you are a new member of this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.
@s-debadri, 💯 All CI checkers are successfully verified. Thanks.
LGTM.
I have a simple question. All of this code runs inside a do-while statement, but why is a do-while loop necessary when the while condition is always false?
We want the code to return at the point of failure and not proceed any further. To avoid multiple nested if-else conditions, we use a do-while loop, which keeps the code readable and clean.
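For readers unfamiliar with the idiom, here is a minimal sketch of the `do { ... } while (false)` pattern described in this reply. It is not the code from this PR: `step`, `run_pipeline`, and the step names are hypothetical stand-ins for the buffer and kernel calls, chosen only to show how `break` exits at the first failure without nested if-else blocks.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an operation that can fail
// (for example a buffer write or a kernel launch).
// It records its name in the log and reports success or failure.
bool step(const std::string &name, bool ok, std::vector<std::string> &log) {
  log.push_back(name);
  return ok;
}

// Runs the steps in order and stops at the first failure.
// The loop body executes exactly once because the condition is always false;
// `break` serves as a structured jump to the code after the loop.
bool run_pipeline(bool second_step_ok, std::vector<std::string> &log) {
  bool result = false;

  do {
    if (!step("create_buffers", true, log))
      break;
    if (!step("write_inputs", second_step_ok, log))
      break;
    if (!step("run_kernel", true, log))
      break;
    result = true; // reached only if every step succeeded
  } while (false);

  // Shared exit path: runs after success and after any failure.
  log.push_back("cleanup");
  return result;
}
```

Calling `run_pipeline(false, log)` in this sketch records `create_buffers`, `write_inputs`, and `cleanup`, then returns `false`; the `run_kernel` step is never reached. The same flow written with nested conditions would add one indentation level per step.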
This PR is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 3 days.
Implementation of shared memory for Blas kernels.
Updates made:
- clCreateBuffer calls using wrapper constructor
- WriteData calls
- CL_MEM_USE_HOST_PTR flag
- MapBuffer and UnMapBuffer wrappers to ensure cache consistency

Self evaluation:
Signed-off-by: Debadri Samaddar [email protected]