[ blas/OpenCL ] SGEMM OpenCL kernels added #2648

Merged on Jul 4, 2024 (1 commit)


s-debadri
Contributor

Added SGEMM OpenCL kernels for the following:

  • noTrans
  • transA
  • transB
  • transAB

Self evaluation:

  1. Build test: [X]Passed [ ]Failed [ ]Skipped
  2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Debadri Samaddar [email protected]

taos-ci commented Jun 20, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2648. Please follow the 1commit/1PR (one commit per PR) policy to get comments from reviewers quickly. Your PR must pass all verification processes of cibot before reviewers start the review process. If you are a new member of this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.

taos-ci left a comment

@s-debadri, 💯 All CI checkers are successfully verified. Thanks.

Comment on lines +199 to +206
  GEN_TEST_INPUT(A, ((i * (batch * height * channel) + j * (batch * height) +
                      k * (width) + l + 1) %
                     MOD) *
                      alpha);
  GEN_TEST_INPUT_B(B, ((i * (batch * height_b * channel) +
                        j * (batch * height_b) + k * (width_b) + l + 1) %
                       MOD) *
                        alpha);
Contributor

Was it intended to skip testing on fp16?

Contributor Author

The tests run on fp32 only for now.
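The input formula quoted in this review thread can be sketched as a plain function. The definition of `GEN_TEST_INPUT` is not shown in the thread, so this assumes that `i`, `j`, `k`, `l` iterate over batch, channel, height and width in that order; `gen_test_input` is a hypothetical helper, and `mod` and `alpha` stand in for the test's `MOD` and `alpha`, whose values are not given here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the quoted GEN_TEST_INPUT macro (assumed loop
// order: batch, channel, height, width). Fills a flat buffer with small,
// deterministic values so CPU and GPU results can be compared.
std::vector<float> gen_test_input(int batch, int channel, int height,
                                  int width, int mod, float alpha) {
  std::vector<float> data;
  data.reserve(static_cast<std::size_t>(batch) * channel * height * width);
  for (int i = 0; i < batch; ++i)
    for (int j = 0; j < channel; ++j)
      for (int k = 0; k < height; ++k)
        for (int l = 0; l < width; ++l) {
          // Same expression as in the quoted test code.
          const int val = (i * (batch * height * channel) +
                           j * (batch * height) + k * width + l + 1) % mod;
          data.push_back(static_cast<float>(val) * alpha);
        }
  return data;
}
```

Taking the value modulo `mod` before scaling by `alpha` keeps every element in a small, bounded range, which limits accumulated rounding error when comparing outputs.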

s-debadri force-pushed the gpu_sgemm branch 2 times, most recently from 97cea3d to d156796, on July 2, 2024 08:39
taos-ci left a comment

@s-debadri, 💯 All CI checkers are successfully verified. Thanks.

s-debadri changed the title from "[ Wait For #2630 ][ blas/OpenCL ] SGEMM OpenCL kernels added" to "[ blas/OpenCL ] SGEMM OpenCL kernels added" on Jul 3, 2024
djeong20 (Contributor) left a comment

LGTM!

@@ -44,7 +44,7 @@ TEST(blas_kernels, dotCL_sgemv) {
   int width = 768;

   int height_b = 768;
-  int width_b = 96000;
+  int width_b = 2048;
Contributor

Is there any particular reason to change this?

Contributor Author

No particular reason. I was testing with different values and probably changed it so that this test would run faster while I checked the other test results.
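To put the change in perspective, here is a rough size comparison, assuming `height_b` and `width_b` are the dimensions of the second operand and that the work grows roughly with its element count. These are computed sizes, not measured timings.

```cpp
#include <cstdint>

// Number of float elements in a height x width matrix.
constexpr std::int64_t num_elements(std::int64_t height, std::int64_t width) {
  return height * width;
}

// Old and new sizes of the second operand in the dotCL_sgemv test.
constexpr std::int64_t kOldElems = num_elements(768, 96000);
constexpr std::int64_t kNewElems = num_elements(768, 2048);

static_assert(kOldElems == 73728000, "768 * 96000");
static_assert(kNewElems == 1572864, "768 * 2048");
// At 4 bytes per fp32 value: about 295 MB before versus 6 MiB after.
static_assert(kOldElems * 4 == 294912000, "old size in bytes");
static_assert(kNewElems * 4 == 6291456, "new size in bytes");
```

The second operand shrinks by a factor of 96000 / 2048 = 46.875, which is consistent with the test running noticeably faster.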

Added all possible OpenCL kernels for SGEMM
Added unit tests

Signed-off-by: Debadri Samaddar <[email protected]>
taos-ci left a comment

@s-debadri, 💯 All CI checkers are successfully verified. Thanks.

jijoongmoon merged commit 0d0ab71 into nnstreamer:main on Jul 4, 2024
38 checks passed
s-debadri deleted the gpu_sgemm branch on July 9, 2024 05:43
5 participants