diff --git a/docs/api.rst b/docs/api.rst deleted file mode 100644 index 04d21dd9e..000000000 --- a/docs/api.rst +++ /dev/null @@ -1,8 +0,0 @@ -.. Doxygen - -Doxygen documentation -========================== - -Auto-generated Doxygen documentation for the classes and functions in Quokka is here_. - -.. _here: https://quokka-code.readthedocs.io/en/latest/_static/html/files.html \ No newline at end of file diff --git a/docs/conf.py b/docs/conf.py index ef7d08162..9b4329520 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -61,8 +61,3 @@ # a list of builtin themes. # html_theme = 'sphinx_rtd_theme' - -# Add any paths that contain custom static files (such as style sheets) here, -# relative to this directory. They are copied after the builtin static files, -# so a file named "default.css" will overwrite the builtin "default.css". -html_static_path = ['doxygen'] diff --git a/docs/debugging.rst b/docs/debugging.rst index f41393bb5..d86a9660b 100644 --- a/docs/debugging.rst +++ b/docs/debugging.rst @@ -1,7 +1,7 @@ .. Debugging Debugging -===== +========= General guidelines ----------------------- @@ -43,27 +43,27 @@ The best way to debug on GPUs is to... not debug on GPUs. That is, it is always * On AMD GPUs, there is a `GPU-aware AddressSanitizer `_. Currently, enabling this requires manually changing the compiler flags. How to actually debug on GPUs ------------------------ +----------------------------- As an *absolute last resort* if it is impossible to reproduce the error you are seeing on a CPU-only run, then the best option is to: * downsize the simulation to fit on a single GPU * start the simulation on an NVIDIA GPU from within CUDA-GDB - (see the `documentation `_ and `slides `_). + (see the `CUDA-GDB documentation `_ and `slides `_). * hope CUDA-GDB does not itself crash * hope CUDA-GDB produces a useful error message that you can analyze -NVIDIA also provides the ``compute-sanitizer`` tool that is essentially the equivalent of AddressSanitizer (see the `documentation `_). Unfortunately, it does not work as reliably as AddressSanitizer, and may itself crash while attempting to debug a GPU program. +NVIDIA also provides the ``compute-sanitizer`` tool that is essentially the equivalent of AddressSanitizer (see the `ComputeSanitizer documentation `_). Unfortunately, it does not work as reliably as AddressSanitizer, and may itself crash while attempting to debug a GPU program. For AMD GPUs, you have to use the AMD-provided debugger ``rocgdb``. A tutorial its use is available `here `_. AMD also provides a GPU-aware AddressSanitizer that can be enabled when building Quokka. Currently, the compiler flags must be manually modified in order to enable this. For details, see its `documentation `_. GPU kernel asynchronicity ------------------------ +------------------------- **By default, GPU kernels launch asynchronously, i.e., execution of CPU code continues before the kernel starts on the GPU. This can cause synchronization problems if there is an implicit assumption about the order of operations with respect to CPU and GPU code.** @@ -77,6 +77,6 @@ This will cause the CPU to wait until the GPU kernel execution is complete befor For more details, refer to the `AMReX GPU debugging guide `_. When all else fails: Debugging with ``printf`` ------------------------ +---------------------------------------------- If you have tried *all* of the above steps, then you have to resort to adding ``printf`` statements within the GPU code. Note that ``printf`` inside GPU code is different from the CPU-side ``printf`` function, as explained in the `NVIDIA documentation `_. diff --git a/docs/error_checking.rst b/docs/error_checking.rst index 5a8e3de57..537325412 100644 --- a/docs/error_checking.rst +++ b/docs/error_checking.rst @@ -1,7 +1,7 @@ .. Assertions and error checking Assertions and error checking -========================== +============================= AMReX assert macros ----------------------- diff --git a/docs/howto_clang_tidy.rst b/docs/howto_clang_tidy.rst index 3fda54fbe..361f010f6 100644 --- a/docs/howto_clang_tidy.rst +++ b/docs/howto_clang_tidy.rst @@ -6,14 +6,14 @@ How to use ``clang-tidy`` ``clang-tidy`` is `a command-line tool `_ that automatically enforces certain aspects of code style and provides warnings for common programming mistakes. It automatically runs on every pull request in the Quokka GitHub repository. Using clang-tidy with VSCode ------------------------ +---------------------------- The easiest way to use ``clang-tidy`` on your own computer is to install the `clangd extension for Visual Studio Code `_ (VSCode). (VSCode itself can be downloaded `here `_.) Command-line alternative ------------------------ +------------------------ You can also run ``clang-tidy`` from the command line (see the `documentation `_). diff --git a/docs/index.rst b/docs/index.rst index da365c2fa..0f63fb084 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -78,7 +78,6 @@ Developer Guide error_checking performance howto_clang_tidy - api Epilogue --------------- diff --git a/docs/insitu_analysis.rst b/docs/insitu_analysis.rst index 992e26b60..1454d1da9 100644 --- a/docs/insitu_analysis.rst +++ b/docs/insitu_analysis.rst @@ -1,7 +1,7 @@ .. insitu_analysis In-situ analysis -===== +================ *In-situ analysis* refers to analyzing the simulations as they are running. There are two options: using the *runtime diagnostics* that are built-in to Quokka, and using *Ascent*, a third-party library. @@ -27,7 +27,9 @@ Currently, using this diagnostic requires implementing a custom function in the The problem generator must call `computePlaneProjection(F const &user_f, const int dir)` where `user_f` is a lambda function that returns the value to project and `dir` is the axis along which the projection is taken. -*Example problem generator implementation:* :: +*Example problem generator implementation:* + +.. code-block:: cpp template <> auto RadhydroSimulation::ComputeProjections(const int dir) const -> std::unordered_map> { @@ -42,7 +44,9 @@ where `user_f` is a lambda function that returns the value to project and `dir` return proj; } -*Example input file configuration:* :: +*Example input file configuration:* + +.. code-block:: ini projection_interval = 200 projection.dirs = x z @@ -55,7 +59,9 @@ where `user_f` is a lambda function that returns the value to project and `dir` This outputs 2D slices of the simulation as AMReX plotfiles that can be further examined using, e.g., VisIt or yt. -*Example input file configuration:* :: +*Example input file configuration:* + +.. code-block:: ini quokka.diagnostics = slice_z # Space-separated name(s) of diagnostics (arbitrary) quokka.slice_z.type = DiagFramePlane # Diagnostic type (others may be added in the future) @@ -83,7 +89,9 @@ By default, the bins extend over the full range of the data at a given timestep. Normalization of the output is left up to the user. -*Example input file configuration:* :: +*Example input file configuration:* + +.. code-block:: ini quokka.hist_temp.type = DiagPDF # Diagnostic type quokka.hist_temp.file = PDFTempDens # Output file prefix @@ -100,7 +108,9 @@ Normalization of the output is left up to the user. quokka.hist_temp.gasDensity.range = 1e-29 1e-23 # gasDensity: (Optional, default: data range) Specify min/max of bins -*Filters (based on any variables, not necessary those used for the histogram) can be optionally added:* :: +*Filters (based on any variables, not necessary those used for the histogram) can be optionally added:* + +.. code-block:: ini quokka.hist_temp.filters = dense # (Optional) List of filters quokka.hist_temp.dense.field_name = gasDensity # Filter field @@ -117,7 +127,7 @@ Ascent allows you to generate visualizations (as PNG images) while the simulatio ``export Ascent_DIR=/software/projects/pawsey0807/bwibking/ascent_06082023/install/ascent-develop/lib/cmake/ascent``. Compiling Ascent via Spack -^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^ 1. Run ``spack external find``. 2. Make sure there are entries listed for ``hdf5``, ``cuda``, and ``openmpi`` in your ``~/.spack/packages.yaml`` file. 3. Add `buildable: False `_ to each entry. @@ -128,13 +138,13 @@ For A100 GPUs, change the above lines to `cuda_arch=80`. Currently, it's not possible to `build for both GPU models at the same time `_. Compiling Quokka with Ascent support -^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1. Load Ascent: ``spack load ascent`` 2. Add ``-DAMReX_ASCENT=ON -DAMReX_CONDUIT=ON`` to your CMake options. 3. Compile your problem, e.g.: ``ninja -j4 test_hydro3d_blast`` Customizing the visualization -^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Add an `ascent_actions.yaml file `_ to the simulation working directory. This file can even be edited while the simulation is running! diff --git a/docs/instability.rst b/docs/instability.rst index 42614563c..cb168dd50 100644 --- a/docs/instability.rst +++ b/docs/instability.rst @@ -1,7 +1,7 @@ .. Debugging simulation instability Debugging simulation instability -===== +================================ Nonlinear stability of systems of PDEs is an unsolved problem. There is no complete, rigorous mathematical theory. There are two concepts, however, that are closely associated with nonlinear stability: @@ -24,7 +24,7 @@ It is also possible that the entropy is nondecreasing, but insufficient entropy compared to the amount that should be produced physically. This will cause an unphysical oscillatory solution. Ways to improve stability ------------------------ +------------------------- The solution is either to reduce the timestep or add additional dissipation: * set the initial timestep to be 0.1 or 0.01 of the CFL timestep by setting ``sim.initDt_`` appropriately diff --git a/docs/performance.rst b/docs/performance.rst index 4e7f8dd2f..e2add0aa2 100644 --- a/docs/performance.rst +++ b/docs/performance.rst @@ -1,7 +1,7 @@ .. Performance Performance tips -===== +================ Prerequisites ----------------------- @@ -15,7 +15,7 @@ You should: * Know that calling `amrex::ParallelFor` launches a GPU kernel (when GPU support is enabled at compile time). GPU hardware characteristics ------------------------ +---------------------------- GPUs have hardware design features that make their performance characteristics significantly different from CPUs. In practice, two factors dominate GPU performance behavior: @@ -26,7 +26,7 @@ GPUs have hardware design features that make their performance characteristics s * For more details, see these `AMD website notes `_ and OLCF `training materials `_. MPI communication latency vs. bandwidth ------------------------ +--------------------------------------- A traditional rule of thumb for CPU-based MPI codes is that communication latency often limits performance when scaling to large number of CPU cores (or, equivalently, MPI ranks). We have found that this is *not* the case for Quokka when running on GPU nodes (by, e.g., adding additional dummy variables to the state arrays). diff --git a/docs/postprocessing.rst b/docs/postprocessing.rst index 98736f1c1..0a77bc9a4 100644 --- a/docs/postprocessing.rst +++ b/docs/postprocessing.rst @@ -1,7 +1,7 @@ .. Postprocessing Postprocessing -===== +============== There are several ways to post-process the output of Quokka simulations. AMReX PlotfileTools, yt, and VisIt all allow you to analyze the outputs after they are written to disk. @@ -30,7 +30,9 @@ yt PlotfileTools (see above) can be used instead for axis-aligned slice plots. The plotfile directory can be loaded with ``yt.load`` as usual. However, the standard fields such as ``('gas', 'density')`` are not defined. -Instead, you have to use non-standard fields. Examine ``ds.field_list`` to see the fields that exist in the plotfiles. These should be: :: +Instead, you have to use non-standard fields. Examine ``ds.field_list`` to see the fields that exist in the plotfiles. These should be: + +.. code-block:: python [('boxlib', 'gasDensity'), ('boxlib', 'gasEnergy'), ('boxlib', 'radEnergy'), ('boxlib', 'scalar'), diff --git a/docs/running_on_hpc_clusters.rst b/docs/running_on_hpc_clusters.rst index 769cb7c03..7d9a8e293 100644 --- a/docs/running_on_hpc_clusters.rst +++ b/docs/running_on_hpc_clusters.rst @@ -1,7 +1,7 @@ .. Running on HPC clusters Running on HPC clusters -===== +======================= Instructions for running on various HPC clusters are given below. @@ -35,7 +35,7 @@ Then a single-node test job can be run with: :: sbatch scripts/setonix-1node.submit Workaround for interconnect issues -^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If interconnect issues are observed, it is recommended to add the line :: diff --git a/docs/tests/radhydro_pulse.rst b/docs/tests/radhydro_pulse.rst index 63f2d8e52..4df018f98 100644 --- a/docs/tests/radhydro_pulse.rst +++ b/docs/tests/radhydro_pulse.rst @@ -1,7 +1,7 @@ .. Advecting radiation pulse test Advecting radiation pulse test -========================= +============================== This test demonstrates the code’s ability to deal with the relativistic correction source terms that arise from the mixed frame formulation of