McStas McXtrace 3 and GPU terminology (table)

Peter Willendrup edited this page Nov 30, 2022 · 2 revisions

This document is WORK IN PROGRESS

General concepts

| Concept | Description |
| --- | --- |
| Compute section | Section of the instrument (containing one or more component codes) that performs calculations, either on CPU or in a GPU kernel. |
| GPU kernel | A Compute section (containing one or more component codes) that performs its calculations on GPU. In non-FUNNEL mode all components are put in one GPU kernel. In FUNNEL mode multiple GPU kernels / Compute sections run in succession, switching from GPU to CPU if needed. |
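
The difference between one kernel and several kernels run in succession is mostly a matter of loop ordering. The following is a minimal CPU-only sketch of that ordering, not McStas code: `particle_sketch`, `trace_fn`, `section_sketch` and both `run_*` functions are hypothetical names introduced for illustration, and a component is reduced to a plain function acting on one particle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a particle reduced to its weight, and a
 * component TRACE reduced to a function acting on one particle. */
typedef struct { double p; } particle_sketch;
typedef void (*trace_fn)(particle_sketch *);

/* A compute section covers the components with index [first, last). */
typedef struct { size_t first, last; } section_sketch;

/* non-FUNNEL ordering: a single compute section. Every particle is taken
 * from the first to the last component (on GPU the outer loop would be
 * spread over threads instead of running in series). */
void run_single_section(particle_sketch *ps, size_t n,
                        const trace_fn *comps, size_t ncomp)
{
    assert(ps != NULL && comps != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t c = 0; c < ncomp; c++)
            comps[c](&ps[i]);
}

/* FUNNEL ordering: sections run in succession, and each section processes
 * the whole particle buffer before the next section starts. */
void run_funnel(particle_sketch *ps, size_t n, const trace_fn *comps,
                const section_sketch *secs, size_t nsec)
{
    assert(ps != NULL && comps != NULL && secs != NULL);
    for (size_t s = 0; s < nsec; s++)
        for (size_t i = 0; i < n; i++)
            for (size_t c = secs[s].first; c < secs[s].last; c++)
                comps[c](&ps[i]);
}

/* Two toy components, used only to exercise the loops. */
void halve_weight(particle_sketch *n)   { n->p *= 0.5; }
void quarter_weight(particle_sketch *n) { n->p *= 0.25; }
```

Both orderings apply the same components to every particle, so for independent particles they are expected to give the same result; only the order of execution differs.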

Parallelisation schemes and parallelisation control

| Calculation modes, parallelism | Description |
| --- | --- |
| non-FUNNEL mode (default) | All TRACE code from all components is assembled in a single Compute section / GPU kernel. Each neutron thread is calculated from start to end of the instrument. On CPU this happens in series, on GPU in parallel. |
| FUNNEL mode | Calculations are performed for a bunch of neutrons (of size gpu_innerloop) in multiple Compute sections / GPU kernels. Component code is grouped into one such section until either a SPLIT occurs in the instrument or a transition from GPU to CPU (or vice versa) is needed, see CPU COMPONENT and NOACC. |
| ncount | Total number of particle histories to simulate through your instrument. |
| gpu_innerloop | Used on GPU only. Default and maximum: ~2e9, corresponding to MAX_INT (2e9 is the maximum number of thread executions in a kernel running on your NVIDIA GPU). When ncount is higher than the gpu_innerloop setting, a series of calculations, each with gpu_innerloop threads, will run in your GPU kernel(s). |
| numgangs | Used on GPU only. Default: 7813. Defines the number of thread gangs used for parallelisation in your kernel(s), see explanation from the OpenACC course. |
| vecsize | Used on GPU only. Default: 128. Defines the number of thread vectors used for parallelisation in your kernel(s), see explanation from the OpenACC course. |
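
The relation between ncount and gpu_innerloop can be illustrated with a small calculation. This is a sketch under assumptions, not the McStas implementation: `batch_count` is a hypothetical helper, and `INT_MAX` from `<limits.h>` is used as a stand-in for the default gpu_innerloop. How a final, partially filled batch is handled is not described here, so the sketch only counts batches.

```c
#include <assert.h>
#include <limits.h>

/* Number of gpu_innerloop-sized batches needed to cover ncount particle
 * histories (ceiling division). Hypothetical helper, for illustration. */
long long batch_count(long long ncount, long long gpu_innerloop)
{
    assert(ncount >= 0 && gpu_innerloop > 0);
    return ncount / gpu_innerloop + (ncount % gpu_innerloop != 0 ? 1 : 0);
}
```

For example, `batch_count(5000000000LL, INT_MAX)` evaluates to 3 where `int` is 32 bits wide, since two batches of 2147483647 threads cover only 4294967294 histories.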

The particle struct and particle scope

| Particle-scope | Description |
| --- | --- |
| particle struct | Contains the particle state variables and USERVARS as defined in the instrument (e.g. particle-dependent flags). |
| x, y, z | Particle position variables |
| vx, vy, vz | Particle velocity variables |
| sx, sy, sz | Particle polarisation variables |
| t | Particle time variable |
| p | Particle weight/intensity variable |
| randstate | RNG state, carried independently per particle |
| _uid | Particle "thread number" |

Grammar and keyword behaviour in the instrument file

| Instrument-scope | Description |
| --- | --- |
| USERVARS | A new McStas 3.x section of the instrument file. Allows the definition of flags / particle-dependent variables that are embedded in the particle state. These can be used with the user1 .. user3 options of Monitor_nD and read by other components by means of the particle_getvar() function. |
| CPU COMPONENT | Keyword for the instrument grammar, indicating that computations for this component should happen on CPU. Useful for components that work on both GPU and CPU, but where CPU computation is preferred. Automatically selects FUNNEL mode for the instrument. |
| SPLIT | Keyword for the instrument grammar, indicating that neutrons reaching this point should be "repeated", preserving total intensity. In non-FUNNEL mode, SPLIT 10 means that for each incoming particle, 10 identical particles, each with 1/10 of the incoming weight, are computed from this point onwards. In FUNNEL mode the repetition count is not an input; instead, within the buffer of size gpu_innerloop, any ABSORB'ed particles are replaced by still active ones. Intensity is preserved. |
| JUMP | Keyword for the instrument grammar that lets particles teleport from one component to another (beware, this is a GOTO). Not supported in FUNNEL mode. |
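
The weight bookkeeping behind a non-FUNNEL SPLIT can be sketched in a few lines. This is an illustration only, not the code McStas generates: `particle_sketch`, `split_particle` and `total_weight` are hypothetical names, and the particle is reduced to its weight `p`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical particle, reduced to its weight. */
typedef struct { double p; } particle_sketch;

/* Fill out[0 .. n_split-1] with copies of *in, each carrying 1/n_split of
 * the incoming weight, so that the summed weight equals in->p. */
void split_particle(const particle_sketch *in, particle_sketch *out,
                    size_t n_split)
{
    assert(in != NULL && out != NULL && n_split > 0);
    for (size_t i = 0; i < n_split; i++) {
        out[i] = *in;
        out[i].p = in->p / (double)n_split;
    }
}

/* Sum of the weights of n particles. */
double total_weight(const particle_sketch *ps, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += ps[i].p;
    return sum;
}
```

With `n_split` set to 10, the ten copies together carry the weight of the incoming particle (up to floating-point rounding), which is the sense in which intensity is preserved.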

Keywords and important things to know in component scope

| Component-scope | Description |
| --- | --- |
| NOACC | Header keyword for the component grammar, indicating that the component cannot run on GPU. Useful for components that include algorithms that cannot be ported to GPU, e.g. those that use external (non-McStas and non-GPU) libraries / functions. Automatically selects FUNNEL mode for the instrument. |
| Component-struct | Data that represents a component instance. Two Guide elements of the same type will have independent structs. |
| DECLARE | In components, the DECLARE section must contain single variable declarations, each on its own line and without an assignment. |
| SHARE | Used to define functions and data types to be used within the component. Functions that are to be used in TRACE on GPU should be marked with #pragma acc routine. |
| INITIALIZE | Forms a function that fills runtime parameters in the component struct. |
| TRACE | Section in the component that performs calculations on an incoming particle, defining an outgoing particle. |
| Thread-safe TRACE | As particles are calculated in a highly parallel fashion on GPU, quantities that depend on the particle state cannot be DECLARE / component variables, which live in a common scope. Instead, declare these quantities as local variables within the component TRACE. A symptom of a non-thread-safe component is that GPU and CPU results differ, or that GPU results are not reproducible when repeated with the same seed. |
| #pragma acc atomic [capture/write/update/] | Specifies that the following line of code must be performed by only one GPU thread at a time. Use this e.g. to save data in Monitor histogram arrays and other multi-particle dependent data. |
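
The following sketch shows how a local per-particle variable and an atomic update fit together when filling a histogram. It is plain C with OpenACC directives, not an actual McStas monitor: `bin_index` and `fill_histogram` are hypothetical names, and the data clauses are an assumption about how the arrays would be moved to the device. A compiler without OpenACC support ignores the pragmas and runs the loop in series.

```c
#include <assert.h>
#include <stddef.h>

/* Helper callable from device code; plays the role of a SHARE function.
 * Returns the bin for x in [xmin, xmax), or -1 if x is out of range. */
#pragma acc routine seq
int bin_index(double x, double xmin, double xmax, int nbins)
{
    if (x < xmin || x >= xmax)
        return -1;
    int bin = (int)((x - xmin) / (xmax - xmin) * nbins);
    return bin < nbins ? bin : nbins - 1;   /* guard against rounding up */
}

/* Accumulate particle weights p[i] into hist[] according to x[i]. */
void fill_histogram(const double *x, const double *p, size_t n,
                    double xmin, double xmax, double *hist, int nbins)
{
    assert(x != NULL && p != NULL && hist != NULL);
    assert(nbins > 0 && xmax > xmin);

    #pragma acc parallel loop copyin(x[0:n], p[0:n]) copy(hist[0:nbins])
    for (size_t i = 0; i < n; i++) {
        /* Particle-dependent value: local to this iteration, never shared. */
        int bin = bin_index(x[i], xmin, xmax, nbins);
        if (bin >= 0) {
            /* Several threads may target the same bin, so the update of
             * the shared array is made atomic. */
            #pragma acc atomic update
            hist[bin] += p[i];
        }
    }
}
```

Had `bin` been a variable shared by all threads, one particle could overwrite it between another particle's computation and its use, which is the kind of error the Thread-safe TRACE entry warns about.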