Overview of differences in "classic McStas McXtrace" and the 3.x versions
Peter Willendrup edited this page Feb 8, 2022 · 3 revisions
(For an explanation of individual keywords mentioned below, please consult McStas/McXtrace 3 and GPU terminology (table))
- The Monitor_nD user1-3 user variables have changed type from a flexible define to a string type. Use them together with instrument-level `USERVARS`.
- On GPU, Monitor_nD only saves event-mode data to disk at the end of the simulation in `FINALIZE`, interfacing with the GPU via event arrays of maximum size 2e9 particles; see also the remarks in the next section.
- On GPU, MCPL_input and MCPL_output perform their file operations in `INITIALIZE` and `FINALIZE` respectively, interfacing with the GPU via event arrays of maximum size 2e9 particles; see also the remarks in the next section.
- Components that use CPU-side runtime code within `TRACE` are marked `NOACC`, meaning that they will trigger `FUNNEL` mode and make your simulation a combined CPU-GPU computation. Such components are:
  - NCrystal_sample
  - Sample_nxs
  - Multilayer_Sample
  - Several of the SANS* components from M. Kramer
  - Virtual_mcnp*
  - Virtual_input and Virtual_output
  - SasView_model
  - Union_master (is expected to become GPU-capable within 1-2 releases after McStas 3.1)
- In a classic McStas simulation, each particle is calculated from start to end of the instrument in one go. In the new `FUNNEL` mode, your instrument is divided into sections that are finalised for a batch of particles (of size `--gpu_innerloop`, default 2e9) before the next section is calculated. The divisions occur at:
  - Any component that is marked `CPU COMPONENT` in the instrument file, or any component with the `NOACC` keyword in its header
  - Any `SPLIT` occurrence in the instrument. In `FUNNEL` mode, a `SPLIT` is performed according to the ratio between `ABSORBED` and still-available neutrons within the `gpu_innerloop` buffer
- In `FUNNEL` mode, which allows instruments to combine CPU and GPU calculations, the `JUMP` keyword is disabled. For this reason, a few instruments in the right-most `FUNNEL` column of our nightly tests give different output (status "red"). If you use `JUMP` in `FUNNEL` mode, you will receive the following type of warning at compile time and at runtime:
NOTE: CPU COMPONENT grammar activated:
1) "FUNNEL" raytrace algorithm enabled.
2) Any SPLIT's are dynamically allocated based on available buffer size.
WARNING:
--> JUMP found at COMPONENT 9, CG_2_out
--> JUMPS are not supported in FUNNEL mode and are ignored
--> Your instrument may give different output with FUNNEL
- On GPU, the maximum number of particles that can be executed within a CUDA kernel is `MAX_INT` (roughly 2e9), meaning that if you select a higher ncount, your simulation will run as multiple, serial GPU kernels / CPU compute sections. We have not yet implemented a good, fully-GPU solution for event-list output in such very long simulations, so in such cases your output lists will be capped at a maximum of 2e9 particles. As a workaround we have included special `NOACC` versions of MCPL_output and Monitor_nD that will trigger `FUNNEL` mode and multiple saves from a CPU section of your simulation.
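
The difference between the classic per-particle loop and the batched, section-by-section `FUNNEL` approach described above can be illustrated with a small conceptual model. This is only a sketch, not McStas code: the function names, the toy "sections" and the small batch size are all hypothetical, and real sections are groups of components rather than simple predicates.

```python
import math

def classic_trace(ncount, sections):
    """Classic mode: each particle runs through all sections in one go."""
    survivors = []
    for pid in range(ncount):
        if all(keep(pid) for keep in sections):
            survivors.append(pid)
    return survivors

def funnel_trace(ncount, sections, innerloop):
    """FUNNEL-style mode: finish one section for a whole batch before the next."""
    survivors = []
    # An ncount larger than the batch size is handled as several serial batches.
    n_batches = math.ceil(ncount / innerloop)
    for b in range(n_batches):
        start = b * innerloop
        batch = list(range(start, min(start + innerloop, ncount)))
        for keep in sections:
            # Absorbed particles drop out of the buffer between sections.
            batch = [pid for pid in batch if keep(pid)]
        survivors.extend(batch)
    return survivors, n_batches

# Hypothetical sections: each returns False when the particle is absorbed.
sections = [lambda pid: pid % 2 == 0, lambda pid: pid % 3 == 0]

classic = classic_trace(20, sections)
funnel, n_batches = funnel_trace(20, sections, innerloop=8)
print(classic, funnel, n_batches)
```

In this toy model both approaches keep the same particles; only the order of work differs. The batch count shows how an ncount above the batch size turns into several serial passes.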
A set of new input parameters is enabled in any McStas 3.1 GPU instrument, making it possible to investigate and control GPU parallelism at runtime (please consult the OpenACC website or other OpenACC resources for details):
- `--gpu_innerloop` (maximum/default 2e9) controls the number of particle threads executed within a CUDA kernel
- `--numgangs` (maximum/default 7813) controls the number of OpenACC gangs active in the parallelism
- `--vecsize` (maximum/default 128) controls the OpenACC vector size

At this point our recommendation is to leave these settings at their defaults. Sometimes a simulation ending in an `error 700` / GPU segfault can be made to run by lowering e.g. `numgangs`, but any such error is an indication of thread/memory collisions. Thus, an `error 700` indicates that there is a problem to fix in a component. :)
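
For a rough feel of how the three default values relate, the arithmetic below multiplies gangs by vector size and divides the inner-loop size by the result. It is a back-of-the-envelope sketch only: it assumes one OpenACC worker per gang and an even distribution of particles over vector lanes, neither of which is stated in the text, and the variable names are illustrative.

```python
# Default values quoted in the text above.
numgangs = 7813
vecsize = 128
gpu_innerloop = 2_000_000_000  # 2e9

# Assumption: one worker per gang, so gangs x vector size gives the lane count.
lanes = numgangs * vecsize

# Assumption: particles are spread evenly over lanes (integer ceiling division).
particles_per_lane = -(-gpu_innerloop // lanes)

print(lanes, particles_per_lane)
```

Under these assumptions the defaults correspond to about one million lanes, each handling on the order of two thousand particles per kernel.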
- For now, the "scatter logger" and "shielding logger" post-processing approaches are not yet available in McStas 3. Use McStas 2 entirely, or use `MCPL` to interface McStas 2 and McStas 3 simulations.
- ESS_BIFROST_shielding and PSI_Focus_shielding use "shielding logger" which is not yet available in McStas 3
- MCPL_merge reads/writes MCPL data in TRACE which is not available on GPU
- Test_Magnetic_Userdefined uses a user-supplied field-function via a function pointer, a feature not available on GPU
- Test_Monitor_Sqw uses a symbolic expression to measure `S(Q,w)`, which is not available in McStas 3. Please use Test_Sqw_monitor instead, where a solution using McStas 3 `USERVARS` is implemented.
- Test_Scatter_log_* all use "scatter logger", which is not yet available in McStas 3
- Test_Single_crystal_inelastic is not included since the related component is not yet ported to McStas 3
- Test_shellguides is not included since the related components are not yet ported to McStas 3
- Test_single_magnetic_crystal is not included since the related component is not yet ported to McStas 3
- Union_NCrystal_example* is not available for now. The expectation is that this will come later for `FUNNEL` mode.
- Union_test_abs_logger* is not available for now. The expectation is that this will come later.
- Union_test_mesh is for now not included as the partial GPU-port of Union does not yet support mesh geometries.
- Shielding-logger and scatter-logger components
- The nested mirror geometry Guide_four_side* components
- Single_crystal_inelastic
- Magnetic_Single_crystal