Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ GPU/OpenCL ] Split kernel registration from forwarding method @open sesame 12/02 09:39 #2785

Merged
merged 4 commits into from
Dec 2, 2024

Conversation

EunjuYang
Copy link
Contributor

@EunjuYang EunjuYang commented Nov 4, 2024

  • This draft is a suggestion for [ GPU ] GPU Kernel creation time #2723
  • This draft splits kernel registration from forwarding function.
  • This draft make kernelPtr as static member of layer to avoid redundant kernel registration.
  • This draft contains example update for concat_cl , reshape_cl, and fc_layer_cl only.

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

@taos-ci
Copy link

taos-ci commented Nov 4, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2785. Please a submit 1commit/1PR (one commit per one PR) policy to get comments quickly from reviewers. Your PR must pass all verificiation processes of cibot before starting a review process from reviewers. If you are new member to join this project, please read manuals in documentation folder and wiki page. In order to monitor a progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.

@taos-ci
Copy link

taos-ci commented Nov 4, 2024

:octocat: cibot: @EunjuYang, nntrainer/layers/cl_layers/concat_cl.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md

@EunjuYang EunjuYang force-pushed the gpu_layer_refactor branch 2 times, most recently from f43b253 to d48da33 Compare November 4, 2024 11:07
@taos-ci
Copy link

taos-ci commented Nov 4, 2024

:octocat: cibot: @EunjuYang, nntrainer/layers/cl_layers/layer_impl_cl.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md

Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

@EunjuYang EunjuYang changed the title [WIP/Draft] [ GPU/OpenCL ] Split kernel registration from forwarding method [ GPU/OpenCL ] Split kernel registration from forwarding method Nov 6, 2024
Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

Copy link
Contributor

@baek2sm baek2sm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. LGTM!

nntrainer/cl_context.cpp Show resolved Hide resolved
<< "OpenCL Error: Fail to register concat_cl_axis3_fp16 kernel";
layer_kernel_ptrs.emplace_back(kernel_concat_ptr);

return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick question! can't ConcatLayerCl::registerClKernels() be called twice?
assume it is called for a second time, would it throw a runtime error or return true?

Copy link
Contributor Author

@EunjuYang EunjuYang Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed it is only called once in add_default_object, which is called by registerer ; the registerer is called once. However, it seems better to check it. I will update it.

ClContext &ClContext::Global() {
  static ClContext instance;

  // initializing commandqueue and context
  bool result = instance.clInit();

  if (!result) {
    ml_loge("cl_context: opencl command queue creation failed");
  }

  /// in g++ there is a bug that hangs up if caller throws,
  /// so registerer is noexcept although it'd better not
  /// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70298
  std::call_once(global_cl_context_init_flag, registerer, std::ref(instance));
  return instance;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a condition to check it in the registerClKernels() as well.

if (!layer_kernel_ptrs.empty())

nntrainer/layers/cl_layers/fc_layer_cl.h Show resolved Hide resolved
nntrainer/layers/cl_layers/fc_layer_cl.h Outdated Show resolved Hide resolved
nntrainer/layers/cl_layers/concat_cl.cpp Outdated Show resolved Hide resolved
@EunjuYang EunjuYang added the WIP label Nov 18, 2024
- This commit is draft
- This commit splits kernel registeration from forwarding function.
- This is WIP. This commit contains example update for concat_cl and
fc_layer_cl.

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>
- This commit updates reshape_cl.cpp/.h to inherit LayerImplCl.
- This commit implements registerClKernels(), which is called in
context_cl.cpp
- update fc_layer_cl.h (removing redundant variable)
- update register_kernels only return true when all kernels are
successfully registered.
- add conditional code to check kernel is already registered

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Eunju Yang <[email protected]>
- This commit do a fp16-related bugfix in concat_cl.cpp.
- add condition `ENABLE_FP16`
- update __fp16 to _FP16

Signed-off-by: Eunju Yang <[email protected]>
@EunjuYang
Copy link
Contributor Author

📢 An additional commit to fix fp16-related issue in concat_cl is included : da18596
🩹 @djeong20 's recommendation is applied. Please review it and give me feedbacks. Thanks.

Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

- clang format is applied.
- revert Android.mk
- fix bug in registerClKernels

Signed-off-by: Eunju Yang <[email protected]>
Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

Copy link
Contributor

@djeong20 djeong20 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the hard work! 👍

int dim = int(input1_batch_size * input1_width * input1_height *
(input1_channels + input2_channels));
int dim = int(input1_batch_size * input1_channels * input1_width *
(input1_height + input2_height));

opencl::Buffer inputA(cl_context_ref.context_inst_,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about change the oopencl::Buffer to take the Tensor itself? Then we can set clCreateBuffer depending on type and we do not need to consider the type here.

Buffer::Buffer(ContextManager &context_manager, int size_in_bytes,

Copy link
Contributor Author

@EunjuYang EunjuYang Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great ! It is required to support heterogeneous computing units with a common interface. I will create an issue and handle it in another PR! Thank you for your opinion :)

Copy link
Collaborator

@jijoongmoon jijoongmoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jijoongmoon jijoongmoon changed the title [ GPU/OpenCL ] Split kernel registration from forwarding method [ GPU/OpenCL ] Split kernel registration from forwarding method @open sesame 12/02 09:39 Dec 2, 2024
Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EunjuYang, 💯 All CI checkers are successfully verified. Thanks.

@jijoongmoon jijoongmoon merged commit 4b6776b into nnstreamer:main Dec 2, 2024
47 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants