[ GPU/OpenCL ] Split kernel registration from forwarding method @open sesame 12/02 09:39 #2785

EunjuYang · 2024-11-04T10:54:22Z

This draft is a suggestion for [ GPU ] GPU Kernel creation time #2723.
This draft splits kernel registration from the forwarding function.
This draft makes kernelPtr a static member of the layer to avoid redundant kernel registration.
This draft contains example updates for concat_cl, reshape_cl, and fc_layer_cl only.
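
The idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakeKernel`, `buildKernel`, `buildCount`, and `DemoConcatLayerCl` are hypothetical stand-ins, and no real OpenCL call is made. Only the names `registerClKernels`, `layer_kernel_ptrs`, and the general shape (a static container filled once, then read by forwarding) come from this thread.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a compiled OpenCL kernel handle.
struct FakeKernel {
  std::string name;
};

// Counts how often the (expensive) kernel build step runs.
inline int &buildCount() {
  static int count = 0;
  return count;
}

// Hypothetical stand-in for building and registering one kernel.
inline std::shared_ptr<FakeKernel> buildKernel(const std::string &name) {
  ++buildCount();
  return std::make_shared<FakeKernel>(FakeKernel{name});
}

class DemoConcatLayerCl {
public:
  // Intended to be called once at context setup, not from forwarding().
  static bool registerClKernels() {
    auto kernel = buildKernel("concat_cl_axis3");
    if (!kernel)
      return false;
    layer_kernel_ptrs.emplace_back(kernel);
    return true;
  }

  // Forwarding only looks the kernel up; it never builds one.
  std::string forwarding() const {
    assert(!layer_kernel_ptrs.empty());
    return layer_kernel_ptrs.front()->name;
  }

private:
  // Static member: shared by every instance of the layer.
  static std::vector<std::shared_ptr<FakeKernel>> layer_kernel_ptrs;
};

std::vector<std::shared_ptr<FakeKernel>> DemoConcatLayerCl::layer_kernel_ptrs;
```

In this sketch, creating several layer instances and calling `forwarding()` on each leaves `buildCount()` at 1, which is the redundancy the draft aims to remove.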

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

taos-ci · 2024-11-04T10:54:25Z

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2785. Please follow the 1commit/1PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before the review process can start. If you are a new member of this project, please read the manuals in the documentation folder and on the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.

taos-ci · 2024-11-04T10:54:30Z

cibot: @EunjuYang, nntrainer/layers/cl_layers/concat_cl.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md

taos-ci · 2024-11-04T11:08:02Z

cibot: @EunjuYang, nntrainer/layers/cl_layers/layer_impl_cl.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md

taos-ci

@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

baek2sm

Nice work. LGTM!

nntrainer/cl_context.cpp

djeong20 · 2024-11-14T23:35:55Z

nntrainer/layers/cl_layers/concat_cl.cpp

+    << "OpenCL Error: Fail to register concat_cl_axis3_fp16 kernel";
+  layer_kernel_ptrs.emplace_back(kernel_concat_ptr);
+
+  return true;


Quick question: can ConcatLayerCl::registerClKernels() be called twice?
If it is called a second time, would it throw a runtime error or return true?

I assumed it is called only once, from add_default_object, which is called by registerer; the registerer itself is called once. However, it seems better to check for this explicitly. I will update it.

ClContext &ClContext::Global() {
  static ClContext instance;

  // initializing commandqueue and context
  bool result = instance.clInit();

  if (!result) {
    ml_loge("cl_context: opencl command queue creation failed");
  }

  /// in g++ there is a bug that hangs up if caller throws,
  /// so registerer is noexcept although it'd better not
  /// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70298
  std::call_once(global_cl_context_init_flag, registerer, std::ref(instance));
  return instance;
}

I added a condition to check it in the registerClKernels() as well.

nntrainer/nntrainer/layers/cl_layers/reshape_cl.cpp

Line 55 in 6dc250f

if (!layer_kernel_ptrs.empty())
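
A guard like the one quoted above makes registration safe to call more than once. The sketch below shows one way such a check could behave. It is an assumption-laden illustration: `StubKernel`, `DemoReshapeLayerCl`, `kernelCount`, and the kernel name `demo_copy_kernel` are hypothetical, and the thread does not show what the real function returns on a repeated call, so returning `false` here is only one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a compiled OpenCL kernel handle.
struct StubKernel {
  std::string name;
};

class DemoReshapeLayerCl {
public:
  static bool registerClKernels() {
    // Guard from the review thread: do nothing if kernels already exist.
    if (!layer_kernel_ptrs.empty()) {
      // Assumption: report "nothing registered" instead of throwing.
      return false;
    }
    layer_kernel_ptrs.emplace_back(
      std::make_shared<StubKernel>(StubKernel{"demo_copy_kernel"}));
    return true;
  }

  // Helper for inspection only; not part of the interface in this PR.
  static std::size_t kernelCount() { return layer_kernel_ptrs.size(); }

private:
  static std::vector<std::shared_ptr<StubKernel>> layer_kernel_ptrs;
};

std::vector<std::shared_ptr<StubKernel>> DemoReshapeLayerCl::layer_kernel_ptrs;
```

With this guard, a second call leaves the static list unchanged, so the registry never holds duplicate kernel pointers even if the once-only assumption about registerer is ever broken.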

nntrainer/layers/cl_layers/fc_layer_cl.h

nntrainer/layers/cl_layers/concat_cl.cpp

- This commit is draft
- This commit splits kernel registration from forwarding function.
- This is WIP. This commit contains example update for concat_cl and fc_layer_cl.

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>

- This commit updates reshape_cl.cpp/.h to inherit LayerImplCl.
- This commit implements registerClKernels(), which is called in context_cl.cpp
- update fc_layer_cl.h (removing redundant variable)
- update register_kernels to return true only when all kernels are successfully registered.
- add conditional code to check whether the kernel is already registered

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>

- This commit applies an fp16-related bugfix in concat_cl.cpp.
- add condition `ENABLE_FP16`
- update __fp16 to _FP16

Signed-off-by: Eunju Yang <[email protected]>

EunjuYang · 2024-11-20T02:54:29Z

📢 An additional commit to fix fp16-related issue in concat_cl is included : da18596
🩹 @djeong20's recommendation is applied. Please review it and give me feedback. Thanks.

taos-ci

@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

nntrainer/layers/cl_layers/concat_cl.cpp

- clang format is applied.
- revert Android.mk
- fix bug in registerClKernels

Signed-off-by: Eunju Yang <[email protected]>

taos-ci

@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

djeong20

Thank you for the hard work! 👍

jijoongmoon · 2024-11-26T23:30:21Z

nntrainer/layers/cl_layers/concat_cl.cpp

-    int dim = int(input1_batch_size * input1_width * input1_height *
-                  (input1_channels + input2_channels));
+    int dim = int(input1_batch_size * input1_channels * input1_width *
+                  (input1_height + input2_height));

    opencl::Buffer inputA(cl_context_ref.context_inst_,


How about changing opencl::Buffer to take the Tensor itself? Then we can set up clCreateBuffer depending on the type, and we do not need to consider the type here.

nntrainer/nntrainer/opencl/opencl_buffer.cpp

Line 30 in ff71fad

Buffer::Buffer(ContextManager &context_manager, int size_in_bytes,

Great! This is needed to support heterogeneous computing units with a common interface. I will create an issue and handle it in another PR. Thank you for your opinion :)
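
The suggestion above could look roughly like the following. This is only a sketch of the idea, not the nntrainer API: `DemoDataType`, `DemoTensor`, `elementSize`, and `DemoBuffer` are hypothetical, the element sizes assume 2-byte FP16 and 4-byte FP32, and the actual `clCreateBuffer` call is left out. The point is that the byte size is derived from the tensor inside the buffer constructor, so the caller no longer branches on the type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical data types; sizes assume 2-byte FP16 and 4-byte FP32.
enum class DemoDataType { FP16, FP32 };

// Hypothetical minimal tensor description.
struct DemoTensor {
  DemoDataType type;
  std::size_t num_elements;
};

inline std::size_t elementSize(DemoDataType type) {
  switch (type) {
  case DemoDataType::FP16:
    return 2;
  case DemoDataType::FP32:
    return 4;
  }
  return 0; // unreachable for the enumerators above
}

class DemoBuffer {
public:
  // Existing style: the caller computes the size in bytes.
  explicit DemoBuffer(std::size_t size_in_bytes)
    : size_in_bytes_(size_in_bytes) {}

  // Suggested style: the buffer derives the size from the tensor itself.
  explicit DemoBuffer(const DemoTensor &tensor)
    : DemoBuffer(tensor.num_elements * elementSize(tensor.type)) {}

  std::size_t sizeInBytes() const { return size_in_bytes_; }

private:
  std::size_t size_in_bytes_;
};
```

Under these assumptions, a 6-element FP32 tensor yields a 24-byte buffer and a 6-element FP16 tensor yields a 12-byte buffer, with no type check at the call site.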

jijoongmoon

LGTM

taos-ci

@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

github-actions bot added the Need Review label Nov 4, 2024

EunjuYang force-pushed the gpu_layer_refactor branch 2 times, most recently from f43b253 to d48da33 Compare November 4, 2024 11:07

taos-ci approved these changes Nov 4, 2024

EunjuYang force-pushed the gpu_layer_refactor branch from d48da33 to bff721b Compare November 5, 2024 11:45

taos-ci approved these changes Nov 5, 2024

EunjuYang changed the title ~~[WIP/Draft] [ GPU/OpenCL ] Split kernel registration from forwarding method~~ [ GPU/OpenCL ] Split kernel registration from forwarding method Nov 6, 2024

EunjuYang force-pushed the gpu_layer_refactor branch from bff721b to 4354f53 Compare November 6, 2024 03:27

taos-ci approved these changes Nov 6, 2024

EunjuYang marked this pull request as ready for review November 6, 2024 04:31

EunjuYang requested review from myungjoo, jijoongmoon, again4you, jaeyun-jung, leemgs, wooksong, gichan-jang, anyj0527, lhs8928, songgot, jihochu, DonghakPark, SeoHyungjun, baek2sm, skykongkong8 and djeong20 as code owners November 6, 2024 04:31

baek2sm approved these changes Nov 12, 2024

djeong20 reviewed Nov 14, 2024

EunjuYang added the WIP label Nov 18, 2024

EunjuYang added 3 commits November 20, 2024 10:38

[ FP16 ] Concat_cl fp16-related update

da18596

EunjuYang force-pushed the gpu_layer_refactor branch from 4354f53 to da18596 Compare November 20, 2024 02:37

EunjuYang removed the WIP label Nov 20, 2024

EunjuYang force-pushed the gpu_layer_refactor branch from 6dc250f to e0bc0d8 Compare November 20, 2024 02:48

taos-ci approved these changes Nov 20, 2024

djeong20 reviewed Nov 20, 2024

[ trivial ] clang format & revert some tmp codes & fix bug

da83297

EunjuYang force-pushed the gpu_layer_refactor branch from e0bc0d8 to da83297 Compare November 21, 2024 02:10

taos-ci approved these changes Nov 21, 2024

EunjuYang mentioned this pull request Nov 25, 2024

[ GPU ] split kernel registration from forwarding function in rmsnorm_layer_cl #2804

Merged

djeong20 approved these changes Nov 26, 2024

github-actions bot added PR/READY2MERGE and removed Need Review labels Nov 26, 2024

jijoongmoon reviewed Nov 26, 2024

jijoongmoon approved these changes Dec 2, 2024

jijoongmoon changed the title ~~[ GPU/OpenCL ] Split kernel registration from forwarding method~~ [ GPU/OpenCL ] Split kernel registration from forwarding method @open sesame 12/02 09:39 Dec 2, 2024

taos-ci approved these changes Dec 2, 2024

jijoongmoon merged commit 4b6776b into nnstreamer:main Dec 2, 2024
47 checks passed
