Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GPU/OpenCL] Initial version of Addition Layer with OpenCL ops #2606

Merged
merged 3 commits into from
Jun 25, 2024

Conversation

yashSingh0723
Copy link
Contributor

@yashSingh0723 yashSingh0723 commented May 23, 2024

Added initial version of Addition Layer for GPU. This is a basic implementation using naive kernel.
Changes added with this PR:

  • addition_layer_cl.cpp added containing the new AdditionLayerCL class for OpenCL implementation.
  • Updated LayerKernel enum inside layer_context.h.
  • Added unittest_layers_addition_cl.cpp to test Addition Layer on GPU.

Signed-off-by: Yash Singh [email protected]

@taos-ci
Copy link

taos-ci commented May 23, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2606. Please a submit 1commit/1PR (one commit per one PR) policy to get comments quickly from reviewers. Your PR must pass all verificiation processes of cibot before starting a review process from reviewers. If you are new member to join this project, please read manuals in documentation folder and wiki page. In order to monitor a progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.

Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashSingh0723, 💯 All CI checkers are successfully verified. Thanks.

addition_cl(data, rdata, size, context);

} else
throw std::invalid_argument("Error: OpenCL fp16 is not supported yet.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is because we are not implementing FP16 yet. isn't it? If it is, then I think supporting FP16 is more important because the LLM is working based on FP16.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added FP16 support for Addition Layer in the latest commit.

input_step_dim.batch(1);
input_step_dim.height(to - from);

Tensor input_step = input_.getSharedDataTensor(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we do like this way, we need to data transfer from CPU to GPU. Doesn't it affect the latency?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jijoongmoon This optimisation will be a part of future PRs.

@yashSingh0723 yashSingh0723 force-pushed the opencl_layer_addition branch from 5de81ff to a8f089f Compare May 27, 2024 06:59
Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashSingh0723, 💯 All CI checkers are successfully verified. Thanks.

Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashSingh0723, 💯 All CI checkers are successfully verified. Thanks.

@yashSingh0723 yashSingh0723 force-pushed the opencl_layer_addition branch from 9976775 to a98c244 Compare June 6, 2024 10:35
Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashSingh0723, 💯 All CI checkers are successfully verified. Thanks.

Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashSingh0723, 💯 All CI checkers are successfully verified. Thanks.

Copy link
Contributor

@s-debadri s-debadri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@yashSingh0723 yashSingh0723 force-pushed the opencl_layer_addition branch from cf43132 to 39bd0d9 Compare June 14, 2024 08:23
Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashSingh0723, 💯 All CI checkers are successfully verified. Thanks.

Copy link
Member

@myungjoo myungjoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashSingh0723 LGTM. However, please address Jijoong's concern before merging.

Added naive version of OpenCL implementation for Addition Layer.
Incorporated kernel for ops used.
Added unit test for addition_layer_cl.

Signed-off-by: yash.singh <[email protected]>
Added addition kernel to enhance reusability of the common blas kernels.
Used AdditionLayer interface for both CPU and GPU calls.

Signed-off-by: yash.singh <[email protected]>

[GPU/OpenCL] Initial version of Addition Layer with OpenCL ops

Added naive version of OpenCL implementation for Addition Layer.
Incorporated kernel for ops used.
Added unit test for addition_layer_cl.

Signed-off-by: yash.singh <[email protected]>

[GPU/OpenCL] Addition Kernel added in reusable blas OpenCL kernels

Added addition kernel to enhance reusability of the common blas kernels.
Used AdditionLayer interface for both CPU and GPU calls.

Signed-off-by: yash.singh <[email protected]>
@yashSingh0723 yashSingh0723 force-pushed the opencl_layer_addition branch from 39bd0d9 to 3a17d5f Compare June 24, 2024 10:07
@taos-ci
Copy link

taos-ci commented Jun 24, 2024

:octocat: cibot: @yashSingh0723, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2606-202406241907050.10789704322815-3a17d5f84c1064d30aa730011032456d1efe5e94/.

Added fp16 support for Addition layer
Added unit tests for fp16 support
Updated the Layer Semantics for GPU

Signed-off-by: yash.singh <[email protected]>
@yashSingh0723 yashSingh0723 force-pushed the opencl_layer_addition branch from 3a17d5f to 5ad3e53 Compare June 24, 2024 11:42
Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashSingh0723, 💯 All CI checkers are successfully verified. Thanks.

Copy link
Collaborator

@jijoongmoon jijoongmoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jijoongmoon jijoongmoon merged commit d0c6be9 into nnstreamer:main Jun 25, 2024
37 of 38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants