[GPU/OpenCL] Initial version of Reshape Layer with OpenCL ops #2651
Conversation
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2651. Please follow the 1 commit/1 PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before starting a review process from reviewers. If you are a new member joining this project, please read the manuals in the documentation folder and wiki page. In order to monitor the progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.
@niket-agarwal, 💯 All CI checkers are successfully verified. Thanks.
Just wondering: why do we need this ReshapeLayer in the first place?
std::string reshape_cl_kernel_ =
  R"(__kernel void reshape_cl(__global const float* input,
                              __global float* output,
                              const int batchsize,
                              const int channels,
                              const int height,
                              const int width) {

    int elements_per_batch = channels * height * width;
    int global_id = get_global_id(0);
    int batch_index = global_id / elements_per_batch;
    int element_index = global_id % elements_per_batch;

    if (batch_index < batchsize) {
      int input_channel = element_index / (height * width);
      int input_height = (element_index % (height * width)) / width;
      int input_width = element_index % width;

      int input_index = batch_index * channels * height * width +
                        input_channel * height * width +
                        input_height * width +
                        input_width;

      int output_index = batch_index * elements_per_batch + element_index;

      output[output_index] = input[input_index];
    }
  })";
If the whole process is simply copying, couldn't it be as follows?
int i = get_global_id(0);
output[i] = input[i];
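(For reference, the flat copy is equivalent because, on a contiguous NCHW tensor, the original kernel's index arithmetic is the identity map: decomposing the flat index into channel/height/width and recomposing it yields the same index back. A minimal host-side check, illustrative only and not part of this PR, with arbitrary dimension values:

#include <cassert>

int main() {
  const int batchsize = 2, channels = 3, height = 4, width = 5;
  const int elements_per_batch = channels * height * width;
  for (int gid = 0; gid < batchsize * elements_per_batch; ++gid) {
    // Recompute the indices exactly as the original kernel does.
    int batch_index = gid / elements_per_batch;
    int element_index = gid % elements_per_batch;
    int c = element_index / (height * width);
    int h = (element_index % (height * width)) / width;
    int w = element_index % width;
    int input_index = batch_index * channels * height * width +
                      c * height * width + h * width + w;
    int output_index = batch_index * elements_per_batch + element_index;
    // Decompose-then-recompose is the identity, so both indices
    // equal the global id and the kernel reduces to a flat copy.
    assert(input_index == gid && output_index == gid);
  }
  return 0;
}
)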
Thanks for the suggestion! I have tested it and updated it in the latest commit.
LGTM
* @brief Process data and dimensions for reshape operation
* @param[in] input Tensor
* @param[in] result Tensor
* @param[in] RunLayerContext reference
In the functions below, the same argument is documented with a different explanation.

Suggested change:
- * @param[in] RunLayerContext reference
+ * @param[in] context RunLayerContext reference
* @file unittest_layers_reshape_cl.cpp
* @date 18th June 2024
* @brief Reshape Layer Test
* @see https://github.com/nnstreamer/nntrainer
* @author Niket Agarwal <[email protected]>
* @bug No known bugs except for NYI items
The indentation of the code is slightly different.
In addition, it would be better to change the date format and the order of @brief for unity with other parts.

Suggested change:
- * @file unittest_layers_reshape_cl.cpp
- * @date 18th June 2024
- * @brief Reshape Layer Test
- * @see https://github.com/nnstreamer/nntrainer
- * @author Niket Agarwal <[email protected]>
- * @bug No known bugs except for NYI items
+ * @file unittest_layers_reshape_cl.cpp
+ * @date 18 June 2024
+ * @see https://github.com/nnstreamer/nntrainer
+ * @author Niket Agarwal <niket.a@samsung.com>
+ * @bug No known bugs except for NYI items
+ * @brief Reshape Layer Test
Updated.
std::string reshape_cl_kernel_ =
  R"(__kernel void reshape_cl(__global const float* input,
                              __global float* output,
                              const int batchsize,
                              const int channels,
                              const int height,
                              const int width) {

    int elements_per_batch = channels * height * width;
    int global_id = get_global_id(0);
    int batch_index = global_id / elements_per_batch;
    int element_index = global_id % elements_per_batch;

    if (batch_index < batchsize) {
      int input_channel = element_index / (height * width);
      int input_height = (element_index % (height * width)) / width;
      int input_width = element_index % width;

      int input_index = batch_index * channels * height * width +
                        input_channel * height * width +
                        input_height * width +
                        input_width;

      int output_index = batch_index * elements_per_batch + element_index;

      output[output_index] = input[input_index];
    }
  })";
This kernel only holds for the {dimsize, 0, 0} case.
If you intended this global-level parallelization only, @djeong20's suggestion seems to be better.
If you find a more generalized one, however, please update this kernel to use the indices you computed in this code block.
Updated with @djeong20's suggestion. Thanks.
Force-pushed ba6570e to 0a7c9cd
@niket-agarwal, 💯 All CI checkers are successfully verified. Thanks.
Appreciate the hard work! One last suggestion: rename the reshape_cl kernel to copy_cl. Since the kernel is basically copying data, I think copy_cl better reflects the function of the kernel (the final result would be ReshapeLayerCl using the copy_cl kernel).
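(For illustration, a minimal sketch of what the renamed kernel might look like after this change; this is a hypothetical final form, the actual commit may differ:

// Hypothetical renamed kernel: a flat one-to-one copy, launched with
// one work-item per tensor element.
std::string copy_cl_kernel_ =
  R"(__kernel void copy_cl(__global const float* input,
                           __global float* output) {
    int i = get_global_id(0);
    output[i] = input[i];
  })";
)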
Force-pushed b9d7964 to 484f0f4
@niket-agarwal, 💯 All CI checkers are successfully verified. Thanks.
Added naive version of OpenCL implementation for Reshape Layer. Incorporated kernel for ops used. Added unit test for Reshape_layer_cl.
Signed-off-by: Niket Agarwal <[email protected]>
Force-pushed 484f0f4 to fc4c7b3
@niket-agarwal, 💯 All CI checkers are successfully verified. Thanks.
Great work! LGTM
LayerGoldenTestParamOptions::SKIP_CALC_DERIV |
  LayerGoldenTestParamOptions::SKIP_CALC_GRAD |
  LayerGoldenTestParamOptions::USE_INC_FORWARD,
"nchw", "fp32", "fp32");
Please add the fp16 test case and negative test case (maybe in a later PR?)
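(As a rough illustration of the requested fp16 case, a variant of the existing parameterization might look like the following. The factory function, properties, input shape, and golden file name here are assumptions for illustration, not the actual test:

// Hypothetical fp16 variant of the existing fp32 golden test.
auto reshape_basic_plain_w16a16 = LayerGoldenTestParamType(
  nntrainer::createLayer<nntrainer::ReshapeLayerCl>, // assumed factory
  {"target_shape=-1"},            // hypothetical property
  "2:3:3:3",                      // hypothetical input shape
  "reshape_w16a16.nnlayergolden", // hypothetical golden file
  LayerGoldenTestParamOptions::SKIP_CALC_DERIV |
    LayerGoldenTestParamOptions::SKIP_CALC_GRAD |
    LayerGoldenTestParamOptions::USE_INC_FORWARD,
  "nchw", "fp16", "fp16");        // fp16 weights and activations
)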
LGTM
Added initial version of Reshape Layer for GPU. This is a basic implementation using a naive kernel.
Changes added with this PR:
- Added reshape_cl.cpp containing the new ReshapeLayerCl class for the OpenCL implementation.
- Added unittest_layers_reshape_cl.cpp to test the Reshape Layer on GPU.

Signed-off-by: Niket Agarwal [email protected]