[ layer ] Mixed precision forwarding / backwarding for bn layer @open sesame 03/07 10:42 #2462
Conversation
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2462. Please follow the 1 commit/1 PR (one commit per PR) policy to get comments from reviewers quickly. Your PR must pass all verification processes of cibot before the review process by reviewers starts. If you are a new member joining this project, please read the manuals in the documentation folder and wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.
Force-pushed from 412da0f to 11006c0
@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.
nntrainer/layers/bn_layer.cpp
Outdated
TensorDim mu_dim = mu.getDim();
mu_dim.setDataType(ml::train::TensorDim::DataType::FP32);
Tensor mu32(mu_dim, true);
mu32.copyData(mu);
Could this be reduced? For example:
Tensor getFloatTensor(Tensor input) {
  TensorDim dim = input.getDim();
  dim.setDataType(ml::train::TensorDim::DataType::FP32);
  Tensor output(dim, true);
  output.copyData(input);
  return output;
}
Tensor mu32 = getFloatTensor(mu);
Tensor var32 = getFloatTensor(var);
Tensor gamma32 = getFloatTensor(gamma);
...
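A side note on the suggestion: passing input by value copies the argument on every call; taking it as const Tensor & would avoid that extra copy while keeping the helper otherwise identical.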
nntrainer/layers/bn_layer.cpp
Outdated
@@ -213,42 +306,151 @@ void BatchNormalizationLayer::calcDerivative(RunLayerContext &context) {

Tensor &t_reduced = context.getTensor(wt_idx[BNParams::t_reduced]);
Tensor &t_full = context.getTensor(wt_idx[BNParams::t_full]);
if (deriv.getDataType() == ml::train::TensorDim::DataType::FP16) {
#ifdef ENABLE_FP16
TensorDim gamma_dim = gamma.getDim();
same here
Applied!
- According to recent papers on mixed precision, the computation of statistics in the BN layer should use fp32 values:
  input (fp16) <-> BN layer (fp16 weights, but compute in fp32 and re-cast to fp16) <-> output (fp16)
- Although this requires a bulky code block for now, I believe we can revisit here for cleaner code when TensorV2 becomes official.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
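As a minimal sketch of the fp32 round-trip this commit describes (the helper name toFp32 and the header paths are assumptions; getDim, setDataType, the Tensor(dim, true) constructor, and copyData are the nntrainer calls visible in the diff above):

#include <tensor.h>     // nntrainer::Tensor (assumed header path)
#include <tensor_dim.h> // nntrainer::TensorDim (assumed header path)

using nntrainer::Tensor;
using nntrainer::TensorDim;

// Promote a (possibly fp16) tensor to an fp32 working copy.
Tensor toFp32(const Tensor &in) {
  TensorDim dim = in.getDim();
  dim.setDataType(ml::train::TensorDim::DataType::FP32);
  Tensor out(dim, true); // allocate fp32 storage with the same shape
  out.copyData(in);      // copyData casts element-wise across data types
  return out;
}

// Usage inside the layer, schematically:
// Tensor mu32 = toFp32(mu);  // compute BN statistics on the fp32 copy
// ... fp32 math ...
// mu.copyData(mu32);         // re-cast the result back into fp16 storage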
Force-pushed from 11006c0 to 534d25c
Force-pushed from 534d25c to bd50fbe
- For mixed precision training in the BN layer, we have been declaring temporary single-precision Tensors, computing with Tensor ops, and saving the result back into the original half-precision Tensors.
- To remove redundant code, declare a temporary helper function and reuse it.
- We should revisit here for even cleaner code when the TensorV2 refactoring is finished.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
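As a sketch, the refactored flow this commit describes might look like the following (getFloatTensor is the helper suggested in the review above; the exact operands and write-back placement are illustrative, not the final layer code):

// Promote the fp16 operands once through the shared helper.
Tensor mu32 = getFloatTensor(mu);
Tensor var32 = getFloatTensor(var);
Tensor gamma32 = getFloatTensor(gamma);

// ... BN statistics and normalization computed on the fp32 copies ...

// Save the fp32 results back into the original half-precision tensors.
mu.copyData(mu32);
var.copyData(var32);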
Force-pushed from bd50fbe to f48e585
@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.
LGTM
/**
 * This calculates dgamma tensor.
 */
Tensor dgamma = context.getWeightGrad(wt_idx[BNParams::gamma]);
Should this be Tensor &dgamma = context.getWeightGrad(wt_idx[BNParams::gamma]); instead?
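For context, a hedged sketch of the distinction behind this question (whether the by-value copy shares the underlying buffer depends on Tensor's copy semantics, and setZero is used only as an example mutation):

Tensor &dgamma_ref = context.getWeightGrad(wt_idx[BNParams::gamma]);
dgamma_ref.setZero(); // mutates the gradient tensor owned by the context

Tensor dgamma_val = context.getWeightGrad(wt_idx[BNParams::gamma]);
dgamma_val.setZero(); // may only touch a local copy if the copy
                      // constructor deep-copies the buffer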
@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.
LGTM
I don't think this is the right way to enable mixed precision. Please consider #2455.
This PR is no longer needed. Close.