Oversubscribed SMP Performance is Ludicrously Bad #25

Open
jszaday opened this issue Jan 8, 2022 · 0 comments

jszaday commented Jan 8, 2022

In particular, this affects the jacobi, cbench, and pingpong tests.

Cbench is the worst of the bunch, taking more than a minute to complete in CI:

```
8: Test command: /home/runner/work/charmlite/charmlite/charm/bin/charmrun "/home/runner/work/charmlite/charmlite/build/bin/pgm_cbench_benchmark" "+p2" "++ppn2"
8: Test timeout computed to be: 120
8: 
8: Running as 1 OS processes: /home/runner/work/charmlite/charmlite/build/bin/pgm_cbench_benchmark ++ppn2
8: charmrun> /usr/bin/setarch x86_64 -R mpirun -np 1 /home/runner/work/charmlite/charmlite/build/bin/pgm_cbench_benchmark ++ppn2
8: Charm++> Running in SMP mode: 1 processes, 2 worker threads (PEs) + 1 comm threads per process, 2 PEs total
8: Charm++> The comm. thread both sends and receives messages
8: Converse/Charm++ Commit ID: v7.1.0-devel-122-g064b48915
8: Charm++> Using STL-based msgQ:
8: Charm++> Message priorities have been turned off and will not be respected.
8: main> rep 1 of 16
8: main> rep 2 of 16
8: main> rep 3 of 16
8: main> rep 4 of 16
8: main> rep 5 of 16
8: main> rep 6 of 16
8: main> rep 7 of 16
8: main> rep 8 of 16
8: main> rep 9 of 16
8: main> rep 10 of 16
8: main> rep 11 of 16
8: main> rep 12 of 16
8: main> rep 13 of 16
8: main> rep 14 of 16
8: main> rep 15 of 16
8: main> rep 16 of 16
8: info> interleaved 129 broadcasts and reductions across 8 chares
8: info> average time per repetition: 4453.8 ms
8: info> average time per broadcast+reduction: 34525.6 ns
8: [Partition 0][Node 0] End of program
 8/10 Test  #8: pgm_cbench_benchmark_pe2 .........   Passed   72.52 sec
```

It's not uncommon to see these 34525.6 ns broadcast+reduction times on an over-subscribed PC either! We should try to determine what's going on here, and why the performance is so bad for these configurations.

What I've tried so far:

- Enabling or disabling +CmiSleepOnIdle.
- Enabling or disabling CPU topology/affinity.
- Using the lockless queue (--enable-lockless-queue).

Nothing seemed to improve the situation.

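For anyone trying to reproduce this outside of CI, here is a minimal sketch of the launch commands, assuming the same build layout as the CI log above (charm/bin/charmrun and build/bin/pgm_cbench_benchmark relative to the charmlite checkout; adjust the paths to your own tree). As the log shows, +p2 ++ppn2 runs both worker PEs plus the comm thread in a single process, which is what oversubscribes a small runner:

```sh
# Baseline oversubscribed configuration from the CI log:
# 1 process, 2 worker PEs + 1 comm thread.
./charm/bin/charmrun ./build/bin/pgm_cbench_benchmark +p2 ++ppn2

# Variant with the idle-sleep flag mentioned above (tried, no improvement):
./charm/bin/charmrun ./build/bin/pgm_cbench_benchmark +p2 ++ppn2 +CmiSleepOnIdle

# The lockless-queue variant requires rebuilding Charm++ with
# --enable-lockless-queue before rerunning the same commands.
```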