[ util ] Implement softmax calculation function in util #2479
Conversation
skykongkong8 commented on Feb 20, 2024 (edited)
- Current activation functions are implemented as function templates and compute entirely in the precision of the template parameter, unless NEON intrinsics with intermediate fp32 values are used explicitly.
- According to current papers, computing softmax in fp32 is quite critical for the safe convergence of mixed-precision training.
- This PR proposes a SIMD version of the softmax calculation that temporarily uses higher precision when the input is half-precision (see the sketch below this list).
- For numerical stability, a linear translation is applied (the maximum is subtracted so the inputs to the exponential function are non-positive), which avoids precision overflow.
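A minimal sketch of the max-subtraction trick with fp32 accumulation described above (the name `softmax_stable` and the signature are illustrative, not the PR's actual API):

```cpp
#include <algorithm>
#include <cmath>

// Numerically stable softmax sketch: subtracting the maximum makes every
// input to exp() non-positive, so exp() outputs stay in (0, 1] and cannot
// overflow. The sum is accumulated in fp32 even when T is half-precision.
template <typename T>
void softmax_stable(const unsigned int N, const T *X, T *Y) {
  float max_x = static_cast<float>(X[0]);
  for (unsigned int i = 1; i < N; ++i)
    max_x = std::max(max_x, static_cast<float>(X[i]));

  float sum = 0.f; // fp32 accumulator
  for (unsigned int i = 0; i < N; ++i) {
    const float e = std::exp(static_cast<float>(X[i]) - max_x);
    Y[i] = static_cast<T>(e); // store unnormalized value for now
    sum += e;
  }
  for (unsigned int i = 0; i < N; ++i)
    Y[i] = static_cast<T>(static_cast<float>(Y[i]) / sum);
}
```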
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2479. Please follow the one-commit-per-PR (1 commit/1 PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before reviewers start the review process. If you are a new member joining this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.
Force-pushed from 8c72f55 to c87da5a
@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.
LGTM
```cpp
float max(const unsigned int N, float *X) {
#ifdef USE_NEON
  return nntrainer::neon::max(N, X);
#else
```
This is for your future reference.

Use the STL properly:

```cpp
std::vector<float> v(X, X + N);
return *std::max_element(v.begin(), v.end());
```

And if you compile it properly, you may get x64 SIMD (AVX/SSE) for free.
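Note that the copy into a `std::vector` is not strictly needed; `std::max_element` accepts raw pointers as iterators. A sketch of that variant (`max_stl` is an illustrative name, not from the thread):

```cpp
#include <algorithm>

// Scans the input range in place with no temporary vector. With optimization
// enabled (e.g. -O3, plus -march flags permitting AVX/SSE), mainstream
// compilers can auto-vectorize this reduction.
float max_stl(const unsigned int N, const float *X) {
  return *std::max_element(X, X + N);
}
```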
Will apply this right away
Force-pushed from 94d3328 to 400b6ae
- Current softmax implementation does not consider fp32 use in half-precision softmax
- Implement raw and NEON-SIMD versions of the softmax function: fp32, and fp16 with fp32 accumulation

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>

- Unlike the isamax function of BLAS, this function returns the maximum 'value', not the index
- Note that this function is applicable only when the input data is contiguous

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>

- For numerical stability, using negative values as the input of the exponential function is recommended (since the output will then range from 0 to 1)
- Subtract the maximum value before calculating the exponential vectors

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>

- Add an in-place exponential function

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>

- For cleaner code, use std::max_element instead of a for-loop

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
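A sketch of what the in-place exponential from the commit above could look like (the actual name and signature in the PR may differ):

```cpp
#include <cmath>

// Hypothetical in-place exponential over a contiguous buffer: overwrites
// each element with exp(element).
void exp_i(const unsigned int N, float *X) {
  for (unsigned int i = 0; i < N; ++i)
    X[i] = std::exp(X[i]);
}
```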
Force-pushed from 400b6ae to e128e06
@skykongkong8, 💯 All CI checkers are successfully verified. Thanks.
LGTM
LGTM except minor comments.
```cpp
unsigned int i = 0;
float sum = 0.f;
float max_x = max(N, X);
while (i < N) {
```
It might depend on the matrix size N, but you could also optimize further using OpenMP and a temporary buffer to hold X[i] - max_x. You can optimize it later, since we also have to consider the case where NEON is not used.
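A sketch of the reviewer's suggestion (assumed shape, not the PR's code): stage `X[i] - max_x` through a temporary buffer and let OpenMP parallelize the exponential pass and the sum reduction. The name `softmax_omp` is illustrative.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Fallback (non-NEON) softmax parallelized with OpenMP. The temporary buffer
// keeps the shifted exponentials so normalization is a second cheap pass.
void softmax_omp(const unsigned int N, const float *X, float *Y) {
  const float max_x = *std::max_element(X, X + N);
  std::vector<float> shifted(N); // temporary buffer for exp(X[i] - max_x)

  float sum = 0.f;
#pragma omp parallel for reduction(+ : sum)
  for (unsigned int i = 0; i < N; ++i) {
    shifted[i] = std::exp(X[i] - max_x);
    sum += shifted[i];
  }

#pragma omp parallel for
  for (unsigned int i = 0; i < N; ++i)
    Y[i] = shifted[i] / sum;
}
```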