
[ FSU ] Enables Asynchronous FSU for forwarding #2813

Merged
merged 2 commits into from
Dec 10, 2024

Conversation

jijoongmoon
Collaborator

This PR enables asynchronous mode for FSU (flash storage utilization)
for better performance.

It splits tensor loading and unloading, which were previously hard to
handle together. It also fixes the execution order in INFERENCE mode
and sets the trainable option to false when requesting weights and
tensors.

New functions are added to load and unload tensors, as well as to
check whether a load has completed.

It also treats the weight pool and tensor pool differently according
to the ExecutionMode: FSU mode is not used for the tensor pool in
INFERENCE mode.
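The load/unload split described above can be sketched in miniature. This is a minimal illustration with hypothetical names and a toy payload, assuming std::async-style background loads; it is not nntrainer's actual API, whose implementation lives in the graph and tensor-pool classes:

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <map>

// Hypothetical sketch: each execution order owns tensors that must be
// resident in memory before the layer at that order can run.
class AsyncTensorPool {
public:
  // Kick off a background load for the tensors needed at `order` and
  // remember the future so completion can be checked later.
  void LoadTensors(int order) {
    async_load_[order] = std::async(std::launch::async, [order] {
      // ... read the tensors for `order` from flash storage ...
      return order; // toy payload: just echo the order id
    });
  }

  // Block until the load issued for `order` has finished, then drop
  // the bookkeeping entry.
  void WaitLoadComplete(int order) {
    auto it = async_load_.find(order);
    if (it != async_load_.end()) {
      it->second.get();      // join the background load
      async_load_.erase(it); // completed loads are forgotten
    }
  }

  // Release tensors whose execution order is already behind us.
  void UnloadTensors(int /*order*/) { /* ... free host memory ... */ }

  // Number of loads still being tracked (for inspection).
  std::size_t pending() const { return async_load_.size(); }

private:
  std::map<int, std::future<int>> async_load_;
};
```

The key design point mirrored here is that issuing a load and waiting for it are separate calls, so compute on one layer can overlap I/O for the next.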

Resolves:

Self evaluation:

  1. Build test: [X]Passed [ ]Failed [ ]Skipped
  2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon [email protected]

@taos-ci

taos-ci commented Dec 3, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2813. Please follow the 1 commit/1 PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before the reviewers start their review. If you are a new member joining this project, please read the manuals in the documentation folder and wiki page. To monitor the progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.

@taos-ci taos-ci left a comment

@jijoongmoon, 💯 All CI checkers are successfully verified. Thanks.

@@ -0,0 +1,275 @@
// SPDX-License-Identifier: Apache-2.0
/**
* Copyright (C) 2020 Jihoon Lee <[email protected]>
Contributor

name and date?

Collaborator Author

fixed. Thanks.

Contributor
@baek2sm baek2sm left a comment

LGTM

Member
@DonghakPark DonghakPark left a comment

LGTM!

@@ -0,0 +1,275 @@
// SPDX-License-Identifier: Apache-2.0
/**
* Copyright (C) 2020 Jihoon Lee <[email protected]>
Member

typo !

Collaborator Author

fixed. Thanks.

}

/**
* @brief Create resnet 18
Member

typo

Collaborator Author

fixed. Thanks.

@@ -0,0 +1,28 @@
resnet_sources = [
Member

typo

Collaborator Author

fixed. Thanks.

Contributor
@djeong20 djeong20 left a comment

Appreciate the hard work!

} else {
NNTR_THROW_IF(((mode == ExecutionMode::INFERENCE) &&
(exec_mode == ExecutionMode::TRAIN)),
std::invalid_argument)
Contributor

It looks like the if statement already checks mode == ExecutionMode::INFERENCE, so this wouldn't throw an exception. Can we remove it?

Collaborator Author

fixed. Thanks.

<< std::endl;
ml_logd("request load tensor for %d", f + 1);
model_graph.LoadTensors((f / (lookahead + 1) + 1) *
(lookahead + 1));
Contributor

could you explain this part of the code please?

Collaborator Author

I left the comments in code.
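For readers following along, the index arithmetic in the quoted snippet appears to round the current execution order f up to the end of the next lookahead window (window size lookahead + 1), so loads for the following group of layers are issued ahead of time. A small standalone check of that arithmetic, assuming that reading (plain C++, no nntrainer code):

```cpp
#include <cassert>

// With window size w = lookahead + 1, every order f in [k*w, (k+1)*w)
// maps to the target (k+1)*w, i.e. the boundary of the next window.
// This is the same expression as in the snippet above.
int next_window_end(int f, int lookahead) {
  int w = lookahead + 1;
  return (f / w + 1) * w;
}
```

For example, with lookahead = 2 (window size 3), orders 0..2 all request a load up to order 3, and orders 3..5 request up to order 6.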

}
async_load_tensor.erase(order);
ml_logd("wait and completed %d", order);
;
Contributor

Suggested change
;

Collaborator Author

thanks. fixed.
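The wait-then-erase pattern discussed in this thread is commonly built on a condition variable; a generic sketch with hypothetical names (not the PR's exact code, which lives in the network graph), mirroring async_load_tensor.erase(order):

```cpp
#include <cassert>
#include <condition_variable>
#include <map>
#include <mutex>
#include <thread>

// Sketch: a loader thread marks an order complete; the consumer waits
// for that order, then erases the bookkeeping entry.
class LoadTracker {
public:
  void MarkComplete(int order) {
    std::lock_guard<std::mutex> lk(m_);
    done_[order] = true;
    cv_.notify_all();
  }

  void WaitAndErase(int order) {
    std::unique_lock<std::mutex> lk(m_);
    cv_.wait(lk, [&] { return done_.count(order) != 0; });
    done_.erase(order); // completed entries are removed, as in the PR
  }

  bool Has(int order) {
    std::lock_guard<std::mutex> lk(m_);
    return done_.count(order) != 0;
  }

private:
  std::mutex m_;
  std::condition_variable cv_;
  std::map<int, bool> done_;
};
```

Erasing under the same mutex that guards the wait is what makes the removal safe against a concurrent loader touching the map.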

Describe a commit content (up to 80 columns per line) in detail ASAP.

**Changes proposed in this PR:**
- Added TOC generator for README.md

Resolves:

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>
@jijoongmoon jijoongmoon force-pushed the asynch branch 2 times, most recently from 010c1f5 to da64f66 Compare December 10, 2024 01:11
This PR enables asynchronous mode for FSU (flash storage utilization)
for better performance.

It splits tensor loading and unloading, which were previously hard to
handle together. It also fixes the execution order in INFERENCE mode
and sets the trainable option to false when requesting weights and
tensors.

New functions are added to load and unload tensors, as well as to
check whether a load has completed.

It also treats the weight pool and tensor pool differently according
to the ExecutionMode: FSU mode is not used for the tensor pool in
INFERENCE mode.

Resolves:

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>
@jijoongmoon jijoongmoon merged commit cd17a66 into nnstreamer:main Dec 10, 2024
17 checks passed
@jijoongmoon jijoongmoon deleted the asynch branch December 10, 2024 06:17

Step 5. Try to release the weights which have an execution order less than f.

Step n. repeat next layer starting with checking the tenosrs are loaded,
Member

Is this a typo?

Suggested change
Step n. repeat next layer starting with checking the tenosrs are loaded,
Step n. repeat next layer starting with checking the tensors are loaded,
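Step 5 above (releasing weights whose execution order has already passed) can be sketched with a hypothetical container; this is illustrative only, not nntrainer's pool implementation:

```cpp
#include <cassert>
#include <map>
#include <vector>

// Sketch of "Step 5": drop weights whose execution order is below the
// current order f, freeing memory for the layers still to come.
// The map key stands in for a weight's last execution order.
void release_before(std::map<int, std::vector<float>> &pool, int f) {
  for (auto it = pool.begin(); it != pool.end();) {
    if (it->first < f)
      it = pool.erase(it); // erase returns the next valid iterator
    else
      ++it;
  }
}
```

The iterator-returning form of erase is used so that removal during traversal stays well-defined.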
