If you have enough RAM, it is always better to run many trainings in parallel, each with a low number of training threads, than to parallelize each individual training.

Example on a 20-core/40-thread server (approximate theoretical times):
| Mode | Threads (train) | Threads (parallel) | Time for 1 model | Time for 40 models (passes) |
|------|-----------------|--------------------|------------------|-----------------------------|
| Demo Single | 1 | 1 | 500 | 20000 (40) |
| Demo Dual | 2 | 1 | 300 | 12000 (40) |
| Demo Multi | 20 | 1 | 100 | 4000 (40) |
| Demo Multi + Hyperthreading | 40 | 1 | 70 | 2600 (40) |
| Parallel Single | 1 | 20 (RAM Single x20) | 500 | 1000 (2) |
| Parallel Single + Hyperthreading | 1 | 40 (RAM Single x40) | 500 | 700 (1) |
| Parallel Dual | 2 | 10 (RAM Dual x10) | 300 | 1200 (4) |
| Parallel Dual + Hyperthreading | 2 | 20 (RAM Dual x20) | 300 | 840 (1) |
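To make the "(passes)" column concrete: total time is just the per-model time multiplied by the number of passes, i.e. `ceiling(models / parallel slots)`. A quick R sketch (the per-model times are taken straight from the table above; `theoretical_time` is only an illustrative helper, not part of any library):

```r
# Total time = passes * time per model, where passes = ceiling(models / slots)
theoretical_time <- function(n_models, n_slots, t_model) {
  ceiling(n_models / n_slots) * t_model
}

theoretical_time(40, 20, 500)  # Parallel Single:  2 passes -> 1000
theoretical_time(40,  1, 100)  # Demo Multi:      40 passes -> 4000
theoretical_time(40, 10, 300)  # Parallel Dual:    4 passes -> 1200
```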
With hyperthreading, timings decrease by about 30% in theory (in practice, closer to 15-25% due to overhead). The parallel versions only pay the cost of merging results together (and of copying data, if not forking), which is nearly non-existent; use a parallel `lapply` rather than a parallel `for` to remove most of that overhead. It also lets you avoid the negative efficiency issue you may otherwise run into.
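Here is a minimal sketch of the "Parallel Single" mode in R, assuming a hypothetical `train_one()` stand-in for your actual single-threaded training call. On Unix-like systems `mclapply` forks, so the data is shared copy-on-write; on Windows, use `parLapply` with a PSOCK cluster instead:

```r
library(parallel)

# Hypothetical stand-in for a real training call; in practice, set the
# trainer's own thread count to 1 so each worker uses exactly one core.
train_one <- function(seed) {
  set.seed(seed)
  d <- data.frame(y = rnorm(1000), x = rnorm(1000))
  lm(y ~ x, data = d)
}

# 20 single-threaded trainings at a time; the only overhead left is
# collecting the 40 fitted models into a list at the end.
models <- mclapply(1:40, train_one, mc.cores = 20)
```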