
[ layer ] Mixed precision forwarding / backwarding for bn layer @open sesame 03/07 10:42 #2462

Closed
wants to merge 2 commits

Conversation

skykongkong8
Member

  • According to recent papers on mixed precision training, the statistics in the BN layer should be computed with fp32 values
  • input (fp16) <-> BN_layer (fp16 weights, but compute in fp32 and re-cast to fp16) <-> output (fp16); see the sketch after this list
  • Although this requires a bulky code block for now, I believe we can revisit it for cleaner code once TensorV2 becomes official
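A minimal sketch of the intended dataflow, built from the Tensor calls that appear in this PR (getDim, setDataType, copyData). getInput/getOutput with SINGLE_INOUT_IDX follow the usual nntrainer layer-context conventions, and normalizeInFP32 is a hypothetical placeholder for the actual statistics/normalization ops:

// fp16 activations in and out; statistics computed in fp32.
Tensor &input = context.getInput(SINGLE_INOUT_IDX);   // fp16

// Widen the input to fp32 for numerically stable statistics.
TensorDim in32_dim = input.getDim();
in32_dim.setDataType(ml::train::TensorDim::DataType::FP32);
Tensor in32(in32_dim, true);
in32.copyData(input);                                 // fp16 -> fp32

// Hypothetical placeholder for the mean/variance computation
// and normalization performed in fp32.
Tensor out32 = normalizeInFP32(in32);

// Narrow the result back to fp16 for the layer output.
Tensor &output = context.getOutput(SINGLE_INOUT_IDX); // fp16
output.copyData(out32);                               // fp32 -> fp16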

Self evaluation:

  1. Build test: [X]Passed [ ]Failed [ ]Skipped
  2. Run test: [X]Passed [ ]Failed [ ]Skipped


taos-ci commented Feb 6, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2462. Please follow the 1 commit/1 PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before the review by reviewers can start. If you are a new member joining this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.


@taos-ci taos-ci left a comment


@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.

Comment on lines 179 to 182
TensorDim mu_dim = mu.getDim();
mu_dim.setDataType(ml::train::TensorDim::DataType::FP32);
Tensor mu32(mu_dim, true);
mu32.copyData(mu);
Contributor


could be reduced?

Tensor getFloatTensor(const Tensor &input) {
  // Widen to an fp32 copy, preserving the input's shape and format.
  TensorDim dim = input.getDim();
  dim.setDataType(ml::train::TensorDim::DataType::FP32);
  Tensor output(dim, true);
  output.copyData(input);
  return output;
}

Tensor mu32 = getFloatTensor(mu);
Tensor var32 = getFloatTensor(var);
Tensor gamma32 = getFloatTensor(gamma);
...

@@ -213,42 +306,151 @@ void BatchNormalizationLayer::calcDerivative(RunLayerContext &context) {

Tensor &t_reduced = context.getTensor(wt_idx[BNParams::t_reduced]);
Tensor &t_full = context.getTensor(wt_idx[BNParams::t_full]);
if (deriv.getDataType() == ml::train::TensorDim::DataType::FP16) {
#ifdef ENABLE_FP16
TensorDim gamma_dim = gamma.getDim();
Contributor


same here
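Presumably the suggested helper would collapse this block as well; a sketch (gamma32 is a hypothetical local name):

// Sketch: replaces the manual getDim / setDataType / copyData sequence.
Tensor gamma32 = getFloatTensor(gamma);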

Member Author


Applied!

- According to recent papers on mixed precision training, the statistics in the BN layer should be computed with fp32 values
- input (fp16) <-> BN_layer (fp16 weights, but compute in fp32 and re-cast to fp16) <-> output (fp16)
- Although this requires a bulky code block for now, I believe we can revisit it for cleaner code once TensorV2 becomes official

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
@skykongkong8 skykongkong8 changed the title [ DRAFT ] [ layer ] Mixed precision forwarding / backwarding for bn layer [ layer ] Mixed precision forwarding / backwarding for bn layer Feb 26, 2024
- For mixed precision computation in the BN layer, we have been declaring temporary single-precision Tensors, computing with Tensor ops, and saving the result back into the original half-precision Tensors.
- To remove redundant code, declare a temporary helper function and reuse it.
- We should revisit this for even cleaner code when the TensorV2 refactoring is finished.

**Self evaluation:**
1. Build test:     [X]Passed [ ]Failed [ ]Skipped
2. Run test:     [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>

@taos-ci taos-ci left a comment


@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.

Contributor

@baek2sm baek2sm left a comment


LGTM

@DonghakPark DonghakPark changed the title [ layer ] Mixed precision forwarding / backwarding for bn layer [ layer ] Mixed precision forwarding / backwarding for bn layer @open sesame 03/07 10:42 Mar 7, 2024
/**
* This calculates dgamma tensor.
*/
Tensor dgamma = context.getWeightGrad(wt_idx[BNParams::gamma]);
Contributor


Should it be Tensor &dgamma = context.getWeightGrad(wt_idx[BNParams::gamma]); ?
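A toy illustration (not from this PR) of why the & matters if Tensor's copy constructor performs a deep copy, which is the assumption behind this question:

#include <cassert>
#include <vector>

struct Grad { std::vector<float> data; };   // stand-in for Tensor
Grad stored{{0.0f}};
Grad &getWeightGrad() { return stored; }    // stand-in for context.getWeightGrad()

int main() {
  Grad copy = getWeightGrad();  // deep copy: writes below do NOT reach `stored`
  copy.data[0] = 1.0f;
  assert(stored.data[0] == 0.0f);

  Grad &ref = getWeightGrad();  // reference: writes go to the stored gradient
  ref.data[0] = 1.0f;
  assert(stored.data[0] == 1.0f);
  return 0;
}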


@taos-ci taos-ci left a comment


@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.

Member

@SeoHyungjun SeoHyungjun left a comment


LGTM

Collaborator

@jijoongmoon jijoongmoon left a comment


I don't think this is the right way to enable mixed precision. Please consider #2455.

@skykongkong8
Member Author

This PR is no longer needed. Closing.
