Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uninitialised algorithms, move using std::memcpy #6271

Merged

Conversation

Johan511
Copy link
Contributor

@Johan511 Johan511 commented Jun 4, 2023

In case of triviably copyable types we copy using std::memcpy. Performance improvement is almost 10x

Hari Hara Naveen S added 2 commits June 3, 2023 06:21
Signed-off-by: Hari Hara Naveen S <[email protected]>
@StellarBot
Copy link

Performance test report

HPX Performance

Comparison

BENCHMARKFORK_JOIN_EXECUTORPARALLEL_EXECUTORSCHEDULER_EXECUTOR
For Each(=)??(=)

Info

PropertyBeforeAfter
HPX Commitdcb5415cea04ba
HPX Datetime2023-05-10T12:07:53+00:002023-06-04T18:53:34+00:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Datetime2023-05-10T14:50:18.616050-05:002023-06-04T14:00:18.258564-05:00
Clusternamerostamrostam

Comparison

BENCHMARKNO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch-

Info

PropertyBeforeAfter
HPX Commitdcb5415cea04ba
HPX Datetime2023-05-10T12:07:53+00:002023-06-04T18:53:34+00:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Datetime2023-05-10T14:52:35.047119-05:002023-06-04T14:02:36.669467-05:00
Clusternamerostamrostam

Comparison

BENCHMARKFORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATORPARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATORSCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add(=)=(=)
Stream Benchmark - Scale(=)=(=)
Stream Benchmark - Triad(=)=(=)
Stream Benchmark - Copy(=)(=)=

Info

PropertyBeforeAfter
HPX Commitdcb5415cea04ba
HPX Datetime2023-05-10T12:07:53+00:002023-06-04T18:53:34+00:00
Hostnamemedusa08.rostam.cct.lsu.edumedusa08.rostam.cct.lsu.edu
Compiler/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Datetime2023-05-10T14:52:52.237641-05:002023-06-04T14:02:53.869239-05:00
Clusternamerostamrostam

Explanation of Symbols

SymbolMEANING
=No performance change (confidence interval within ±1%)
(=)Probably no performance change (confidence interval within ±2%)
(+)/(-)Very small performance improvement/degradation (≤1%)
+/-Small performance improvement/degradation (≤5%)
++/--Large performance improvement/degradation (≤10%)
+++/---Very large performance improvement/degradation (>10%)
?Probably no change, but quite large uncertainty (confidence interval with ±5%)
??Unclear result, very large uncertainty (±10%)
???Something unexpected…

Hari Hara Naveen S added 22 commits June 4, 2023 14:50
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
cf
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
@hkaiser hkaiser added this to the 1.10.0 milestone Jul 16, 2023
Copy link
Member

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you replace all the try { /* do loop */ } catch(...) { /*cleanup */} with loop_with_cleanup?

Hari Hara Naveen S and others added 7 commits July 16, 2023 10:59
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
Signed-off-by: Hari Hara Naveen S <[email protected]>
@Johan511
Copy link
Contributor Author

@hkaiser requested changes have been made, only existing try-catch blocks are in loop_with_cleanup* functions

Signed-off-by: Hari Hara Naveen S <[email protected]>
@Johan511
Copy link
Contributor Author

@hkaiser can you clarify what does HPX_COMPUTE_DEVICE_CODE signify over here

#if !defined(HPX_COMPUTE_DEVICE_CODE)
inline constexpr loop_t loop = loop_t{};
#else
template <typename ExPolicy, typename Begin, typename End, typename F>
HPX_HOST_DEVICE HPX_FORCEINLINE constexpr Begin loop(
ExPolicy&& policy, Begin begin, End end, F&& f)
{
return hpx::parallel::util::loop_t{}(
HPX_FORWARD(ExPolicy, policy), begin, end, HPX_FORWARD(F, f));
}
template <typename ExPolicy, typename Begin, typename End,
typename CancelToken, typename F>
HPX_HOST_DEVICE HPX_FORCEINLINE constexpr Begin loop(
ExPolicy&& policy, Begin begin, End end, CancelToken& tok, F&& f)
{
return hpx::parallel::util::loop_t{}(
HPX_FORWARD(ExPolicy, policy), begin, end, tok, HPX_FORWARD(F, f));
}
#endif

@hkaiser
Copy link
Member

hkaiser commented Jul 24, 2023

@Johan511 The code in question fails compiling when using nvcc. The #else case adds a workaround. Both cases are functionally equivalent.

Copy link
Member

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks better now. One question, though: you seem to have removed the cancellation tokens altogether. Can you explain the rationale for doing so, please?

@Johan511
Copy link
Contributor Author

@hkaiser can you please point out to the lines you are referring to? In most places I have tried not to change the implementations much.

In some cases where I wrote a function which calls loop_with_cleanup, I have used the loop without any cancellation tokens, I don't see a point to cancellation tokens in this context as the only case we end the loop prematurely is when an error is thrown. This is taken case by try-catch and I don't believe it needs any token.

Is there something I am missing or would you suggest I do away with tokens completely?

@hkaiser
Copy link
Member

hkaiser commented Jul 31, 2023

@hkaiser can you please point out to the lines you are referring to? In most places I have tried not to change the implementations much.

In some cases where I wrote a function which calls loop_with_cleanup, I have used the loop without any cancellation tokens, I don't see a point to cancellation tokens in this context as the only case we end the loop prematurely is when an error is thrown. This is taken case by try-catch and I don't believe it needs any token.

Is there something I am missing or would you suggest I do away with tokens completely?

So I guess your rationale of removing the cancellation tokens was that those would matter only in case of errors. That means we were slowing down the success path in order to speed up the error path. I agree to remove them in this case.

@Johan511
Copy link
Contributor Author

Johan511 commented Jul 31, 2023

That means we were slowing down the success path in order to speed up the error path.

loop_with_cleanup was checking for tok only at the start.

Not sure how having tok was speeding up the error path, we were still throwing the error not doing cleanup with token.

@hkaiser
Copy link
Member

hkaiser commented Jul 31, 2023

That means we were slowing down the success path in order to speed up the error path.

loop_with_cleanup was checking for tok only at the start.

Not sure how having tok was speeding up the error path, we were still throwing the error not doing cleanup with token.

The token was used to stop doing work on the partitions that were not erroneous.

@Johan511
Copy link
Contributor Author

Johan511 commented Aug 3, 2023

The token was used to stop doing work on the partitions that were not erroneous.

We don't seem to explicitly cancel the token in the catch block.
Ref.

@hkaiser
Copy link
Member

hkaiser commented Aug 3, 2023

The token was used to stop doing work on the partitions that were not erroneous.

We don't seem to explicitly cancel the token in the catch block. Ref.

As I said, I'm fine with removing the token if we were using it just to speed up the error path without helping the success path.

Signed-off-by: Hari Hara Naveen S <[email protected]>
Copy link
Member

@hkaiser hkaiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in now. Thanks a lot!

@hkaiser
Copy link
Member

hkaiser commented Aug 5, 2023

bors merge

@bors
Copy link

bors bot commented Aug 5, 2023

Build succeeded!

The publicly hosted instance of bors-ng is deprecated and will go away soon.

If you want to self-host your own instance, instructions are here.
For more help, visit the forum.

If you want to switch to GitHub's built-in merge queue, visit their help page.

@bors bors bot merged commit cff5316 into STEllAR-GROUP:master Aug 5, 2023
17 checks passed
@Johan511
Copy link
Contributor Author

Johan511 commented Aug 9, 2023

uninit-copy

Performance numbers for uninit-copy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants