
[GPU/OpenCL] Initial version of Reshape Layer with OpenCL ops #2651

Merged 1 commit into nnstreamer:main on Jul 3, 2024

Conversation

niket-agarwal (Contributor)

Added initial version of Reshape Layer for GPU. This is a basic implementation using a naive kernel.

Changes added with this PR:

  • reshape_cl.cpp added containing the new ReshapeLayerCL class for OpenCL implementation.
  • Added unittest_layers_reshape_cl.cpp to test Reshape Layer on GPU.

Signed-off-by: Niket Agarwal [email protected]

@taos-ci

taos-ci commented Jun 26, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2651. Please follow the 1 commit/1 PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before the review process by reviewers can start. If you are a new member joining this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.

@taos-ci left a comment

@niket-agarwal, 💯 All CI checkers are successfully verified. Thanks.

@djeong20 (Contributor)

Just wondering: why do we need this ReshapeLayer in the first place?
I think the operations (e.g., forwarding, backwarding) would be a simple data copy that doesn't change the order of the data.

Comment on lines 51 to 46
std::string reshape_cl_kernel_ =
  R"(__kernel void reshape_cl(__global const float* input,
                              __global float* output,
                              const int batchsize,
                              const int channels,
                              const int height,
                              const int width) {

    int elements_per_batch = channels * height * width;
    int global_id = get_global_id(0);
    int batch_index = global_id / elements_per_batch;
    int element_index = global_id % elements_per_batch;

    if (batch_index < batchsize) {
      int input_channel = element_index / (height * width);
      int input_height = (element_index % (height * width)) / width;
      int input_width = element_index % width;

      int input_index = batch_index * channels * height * width +
                        input_channel * height * width +
                        input_height * width +
                        input_width;

      int output_index = batch_index * elements_per_batch + element_index;

      output[output_index] = input[input_index];
    }
  })";
djeong20 (Contributor)

If the whole process is simply copying, couldn't it be as follows?

int i = get_global_id(0);
output[i] = input[i];
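
For context, the index arithmetic in the kernel above collapses to the identity mapping, which is why this one-line copy is equivalent. Below is a small standalone host-side check (a sketch, not part of the PR) that replays the same arithmetic with arbitrary example shapes:

#include <cassert>

// Replays the reshape_cl index math on the host to show that both
// input_index and output_index reduce to global_id, so the kernel is a
// plain element-wise copy. The shape values here are arbitrary examples.
int main() {
  const int batchsize = 2, channels = 3, height = 4, width = 5;
  const int elements_per_batch = channels * height * width;
  for (int global_id = 0; global_id < batchsize * elements_per_batch;
       ++global_id) {
    int batch_index = global_id / elements_per_batch;
    int element_index = global_id % elements_per_batch;
    int input_channel = element_index / (height * width);
    int input_height = (element_index % (height * width)) / width;
    int input_width = element_index % width;
    int input_index = batch_index * channels * height * width +
                      input_channel * height * width + input_height * width +
                      input_width;
    int output_index = batch_index * elements_per_batch + element_index;
    assert(input_index == global_id && output_index == global_id);
  }
  return 0;
}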

niket-agarwal (Contributor, Author)

Thanks for the suggestion! I have tested and updated in the latest commit.
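
A minimal sketch of what the simplified kernel string might look like after this change (illustrative, not the exact committed code; it assumes the host enqueues one work item per element of the flattened tensor):

std::string reshape_cl_kernel_ =
  R"(__kernel void reshape_cl(__global const float *input,
                              __global float *output) {
    // Reshape does not move data for contiguous tensors, so each work item
    // copies one element of the flat buffer.
    int i = get_global_id(0);
    output[i] = input[i];
  })";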

@SeoHyungjun (Member) left a comment

LGTM

* @brief Process data and dimensions for reshape operation
* @param[in] input Tensor
* @param[in] result Tensor
* @param[in] RunLayerContext reference
Member

In the functions below, the same argument is documented with different descriptions. Please use a consistent form that includes the parameter name.

Suggested change
- * @param[in] RunLayerContext reference
+ * @param[in] context RunLayerContext reference

Comment on lines 5 to 10
* @file unittest_layers_reshape_cl.cpp
* @date 18th June 2024
* @brief Reshape Layer Test
* @see https://github.com/nnstreamer/nntrainer
* @author Niket Agarwal <[email protected]>
* @bug No known bugs except for NYI items
Member

The indentation of the code is slightly different.
In addition, it would be better to change the date format and the position of @brief for consistency with other parts.

Suggested change
- * @file unittest_layers_reshape_cl.cpp
- * @date 18th June 2024
- * @brief Reshape Layer Test
- * @see https://github.com/nnstreamer/nntrainer
- * @author Niket Agarwal <[email protected]>
- * @bug No known bugs except for NYI items
+ * @file unittest_layers_reshape_cl.cpp
+ * @date 18 June 2024
+ * @see https://github.com/nnstreamer/nntrainer
+ * @author Niket Agarwal <niket.a@samsung.com>
+ * @bug No known bugs except for NYI items
+ * @brief Reshape Layer Test

niket-agarwal (Contributor, Author)

Updated.

Comment on lines 51 to 47
std::string reshape_cl_kernel_ =
  R"(__kernel void reshape_cl(__global const float* input,
                              __global float* output,
                              const int batchsize,
                              const int channels,
                              const int height,
                              const int width) {

    int elements_per_batch = channels * height * width;
    int global_id = get_global_id(0);
    int batch_index = global_id / elements_per_batch;
    int element_index = global_id % elements_per_batch;

    if (batch_index < batchsize) {
      int input_channel = element_index / (height * width);
      int input_height = (element_index % (height * width)) / width;
      int input_width = element_index % width;

      int input_index = batch_index * channels * height * width +
                        input_channel * height * width +
                        input_height * width +
                        input_width;

      int output_index = batch_index * elements_per_batch + element_index;

      output[output_index] = input[input_index];
    }
  })";

Contributor

This kernel only holds for the case where the work size is {dimsize, 0, 0}.
If you intended this global-level parallelization only, @djeong20's suggestion seems better.
If you find a more generalized one, however, please update this kernel to use the indices you computed in this code block.

niket-agarwal (Contributor, Author)

Updated with @djeong20's suggestion. Thanks.

@niket-agarwal force-pushed the gpu_reshape branch 2 times, most recently from ba6570e to 0a7c9cd on July 2, 2024 11:09
@taos-ci left a comment

@niket-agarwal, 💯 All CI checkers are successfully verified. Thanks.

@djeong20 (Contributor) left a comment

Appreciate the hard work! One last suggestion is renaming the reshape_cl kernel to copy_cl. Since the kernel is basically copying data, I think copy_cl better reflects what the kernel does (the final result would be that ReshapeLayerCl uses the copy_cl kernel).
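
For illustration, a rough sketch of how such a copy kernel might be enqueued from the host side; the command queue, kernel object, and buffers are assumed to be created elsewhere, and the helper name here is hypothetical rather than nntrainer's actual API:

#include <CL/cl.h>

// Hypothetical helper: launches the copy kernel with one work item per
// element of the flattened tensor (batch * channel * height * width).
static cl_int enqueue_copy(cl_command_queue queue, cl_kernel copy_kernel,
                           cl_mem in_buf, cl_mem out_buf,
                           size_t element_count) {
  cl_int err = clSetKernelArg(copy_kernel, 0, sizeof(cl_mem), &in_buf);
  if (err != CL_SUCCESS)
    return err;
  err = clSetKernelArg(copy_kernel, 1, sizeof(cl_mem), &out_buf);
  if (err != CL_SUCCESS)
    return err;
  // 1-D NDRange; the global work size equals the total number of elements.
  return clEnqueueNDRangeKernel(queue, copy_kernel, 1, NULL, &element_count,
                                NULL, 0, NULL, NULL);
}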

@niket-agarwal force-pushed the gpu_reshape branch 2 times, most recently from b9d7964 to 484f0f4 on July 3, 2024 06:58
@taos-ci left a comment

@niket-agarwal, 💯 All CI checkers are successfully verified. Thanks.

Added a naive version of the OpenCL implementation for the Reshape Layer.
Incorporated the kernel for the ops used.
Added a unit test for Reshape_layer_cl.

Signed-off-by: Niket Agarwal <[email protected]>
@taos-ci left a comment

@niket-agarwal, 💯 All CI checkers are successfully verified. Thanks.

@djeong20 (Contributor) left a comment

Great work! LGTM

LayerGoldenTestParamOptions::SKIP_CALC_DERIV |
LayerGoldenTestParamOptions::SKIP_CALC_GRAD |
LayerGoldenTestParamOptions::USE_INC_FORWARD,
"nchw", "fp32", "fp32");
Contributor

Please add an fp16 test case and a negative test case (maybe in a later PR?).

@jijoongmoon (Collaborator) left a comment

LGTM

@jijoongmoon merged commit 46400ac into nnstreamer:main on Jul 3, 2024
40 checks passed