fix Sphinx warnings (#688)
### Description
Fixes Sphinx warnings. The documentation now compiles without warnings.
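
Most of the changes in this commit lengthen section underlines that were shorter than their titles. As a rough illustration only (this script is not part of the commit, and the names `ADORNMENT_CHARS`, `is_adornment`, and `short_underlines` are hypothetical), a check for that one class of warning could look like the sketch below. It assumes plain ASCII titles and compares character counts, which is a simplification of what docutils does.

```python
# Hypothetical helper, not part of this commit: flag RST section
# underlines that are shorter than the title line above them.
ADORNMENT_CHARS = set("=-^~\"'`#*+")


def is_adornment(line):
    """True if the line is a run of 3+ copies of one adornment character."""
    s = line.rstrip()
    return len(s) >= 3 and len(set(s)) == 1 and s[0] in ADORNMENT_CHARS


def short_underlines(text):
    """Return (line_number, title, underline) for each too-short underline."""
    lines = text.splitlines()
    problems = []
    for i in range(1, len(lines)):
        title = lines[i - 1].rstrip()
        underline = lines[i].rstrip()
        # Skip blank lines and lines that are themselves adornments.
        if not title or is_adornment(title):
            continue
        if is_adornment(underline) and len(underline) < len(title):
            problems.append((i + 1, title, underline))  # 1-based line number
    return problems


if __name__ == "__main__":
    sample = "Debugging\n=====\n\nGeneral guidelines\n-----------------------\n"
    for lineno, title, underline in short_underlines(sample):
        print(f"line {lineno}: underline {underline!r} is shorter than title {title!r}")
```

Sphinx's own `-W` option, which turns warnings into errors, remains the authoritative check; a script like this only catches the underline case.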

### Related issues
N/A

### Checklist
_Before this pull request can be reviewed, all of these tasks should be
completed. Denote completed tasks with an `x` inside the square brackets
`[ ]` in the Markdown source below:_
- [x] I have added a description (see above).
- [ ] I have added a link to any related issues (see above).
- [x] I have read the [Contributing
Guide](https://github.com/quokka-astro/quokka/blob/development/CONTRIBUTING.md).
- [ ] I have added tests for any new physics that this PR adds to the
code.
- [x] I have tested this PR on my local computer and all tests pass.
- [ ] I have manually triggered the GPU tests with the magic comment
`/azp run`.
- [x] I have requested a reviewer for this PR.
BenWibking authored Jul 28, 2024
1 parent 514e5a7 commit da1931a
Showing 12 changed files with 40 additions and 42 deletions.
8 changes: 0 additions & 8 deletions docs/api.rst

This file was deleted.

5 changes: 0 additions & 5 deletions docs/conf.py
@@ -61,8 +61,3 @@
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['doxygen']
12 changes: 6 additions & 6 deletions docs/debugging.rst
@@ -1,7 +1,7 @@
.. Debugging
Debugging
=====
=========

General guidelines
-----------------------
@@ -43,27 +43,27 @@ The best way to debug on GPUs is to... not debug on GPUs. That is, it is always
* On AMD GPUs, there is a `GPU-aware AddressSanitizer <https://rocm.docs.amd.com/en/latest/understand/using_gpu_sanitizer.html#compiling-for-address-sanitizer>`_. Currently, enabling this requires manually changing the compiler flags.

How to actually debug on GPUs
-----------------------
-----------------------------

As an *absolute last resort* if it is impossible to reproduce the error you are seeing on a CPU-only run, then the best option is to:

* downsize the simulation to fit on a single GPU

* start the simulation on an NVIDIA GPU from within CUDA-GDB
(see the `documentation <https://docs.nvidia.com/cuda/cuda-gdb/index.html>`_ and `slides <https://www.olcf.ornl.gov/wp-content/uploads/2021/06/cuda_training_series_cuda_debugging.pdf>`_).
(see the `CUDA-GDB documentation <https://docs.nvidia.com/cuda/cuda-gdb/index.html>`_ and `slides <https://www.olcf.ornl.gov/wp-content/uploads/2021/06/cuda_training_series_cuda_debugging.pdf>`_).

* hope CUDA-GDB does not itself crash

* hope CUDA-GDB produces a useful error message that you can analyze

NVIDIA also provides the ``compute-sanitizer`` tool that is essentially the equivalent of AddressSanitizer (see the `documentation <https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html>`_). Unfortunately, it does not work as reliably as AddressSanitizer, and may itself crash while attempting to debug a GPU program.
NVIDIA also provides the ``compute-sanitizer`` tool that is essentially the equivalent of AddressSanitizer (see the `ComputeSanitizer documentation <https://docs.nvidia.com/compute-sanitizer/ComputeSanitizer/index.html>`_). Unfortunately, it does not work as reliably as AddressSanitizer, and may itself crash while attempting to debug a GPU program.

For AMD GPUs, you have to use the AMD-provided debugger ``rocgdb``. A tutorial its use is available `here <https://www.olcf.ornl.gov/wp-content/uploads/2021/04/rocgdb_hipmath_ornl_2021_v2.pdf>`_.

AMD also provides a GPU-aware AddressSanitizer that can be enabled when building Quokka. Currently, the compiler flags must be manually modified in order to enable this. For details, see its `documentation <https://rocm.docs.amd.com/en/latest/understand/using_gpu_sanitizer.html#compiling-for-address-sanitizer>`_.

GPU kernel asynchronicity
-----------------------
-------------------------

**By default, GPU kernels launch asynchronously, i.e., execution of CPU code continues before the kernel starts on the GPU. This can cause synchronization problems if there is an implicit assumption about the order of operations with respect to CPU and GPU code.**

@@ -77,6 +77,6 @@ This will cause the CPU to wait until the GPU kernel execution is complete befor
For more details, refer to the `AMReX GPU debugging guide <https://amrex-codes.github.io/amrex/docs_html/Debugging.html#basic-gpu-debugging>`_.

When all else fails: Debugging with ``printf``
-----------------------
----------------------------------------------

If you have tried *all* of the above steps, then you have to resort to adding ``printf`` statements within the GPU code. Note that ``printf`` inside GPU code is different from the CPU-side ``printf`` function, as explained in the `NVIDIA documentation <https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#formatted-output>`_.
2 changes: 1 addition & 1 deletion docs/error_checking.rst
@@ -1,7 +1,7 @@
.. Assertions and error checking
Assertions and error checking
==========================
=============================

AMReX assert macros
-----------------------
4 changes: 2 additions & 2 deletions docs/howto_clang_tidy.rst
@@ -6,14 +6,14 @@ How to use ``clang-tidy``
``clang-tidy`` is `a command-line tool <https://clang.llvm.org/extra/clang-tidy/>`_ that automatically enforces certain aspects of code style and provides warnings for common programming mistakes. It automatically runs on every pull request in the Quokka GitHub repository.

Using clang-tidy with VSCode
-----------------------
----------------------------

The easiest way to use ``clang-tidy`` on your own computer is to install the `clangd extension for Visual Studio Code <https://marketplace.visualstudio.com/items?itemName=llvm-vs-code-extensions.vscode-clangd>`_ (VSCode).

(VSCode itself can be downloaded `here <https://code.visualstudio.com/>`_.)

Command-line alternative
-----------------------
------------------------

You can also run ``clang-tidy`` from the command line (see the `documentation <https://clang.llvm.org/extra/clang-tidy/#using-clang-tidy>`_).

1 change: 0 additions & 1 deletion docs/index.rst
@@ -78,7 +78,6 @@ Developer Guide
error_checking
performance
howto_clang_tidy
api

Epilogue
---------------
28 changes: 19 additions & 9 deletions docs/insitu_analysis.rst
@@ -1,7 +1,7 @@
.. insitu_analysis
In-situ analysis
=====
================

*In-situ analysis* refers to analyzing the simulations as they are running.
There are two options: using the *runtime diagnostics* that are built-in to Quokka, and using *Ascent*, a third-party library.
@@ -27,7 +27,9 @@ Currently, using this diagnostic requires implementing a custom function in the
The problem generator must call `computePlaneProjection(F const &user_f, const int dir)`
where `user_f` is a lambda function that returns the value to project and `dir` is the axis along which the projection is taken.

*Example problem generator implementation:* ::
*Example problem generator implementation:*

.. code-block:: cpp
template <> auto RadhydroSimulation<ShockCloud>::ComputeProjections(const int dir) const -> std::unordered_map<std::string, amrex::BaseFab<amrex::Real>>
{
@@ -42,7 +44,9 @@ where `user_f` is a lambda function that returns the value to project and `dir`
return proj;
}
*Example input file configuration:* ::
*Example input file configuration:*

.. code-block:: ini
projection_interval = 200
projection.dirs = x z
@@ -55,7 +59,9 @@ where `user_f` is a lambda function that returns the value to project and `dir`

This outputs 2D slices of the simulation as AMReX plotfiles that can be further examined using, e.g., VisIt or yt.

*Example input file configuration:* ::
*Example input file configuration:*

.. code-block:: ini
quokka.diagnostics = slice_z # Space-separated name(s) of diagnostics (arbitrary)
quokka.slice_z.type = DiagFramePlane # Diagnostic type (others may be added in the future)
@@ -83,7 +89,9 @@ By default, the bins extend over the full range of the data at a given timestep.

Normalization of the output is left up to the user.

*Example input file configuration:* ::
*Example input file configuration:*

.. code-block:: ini
quokka.hist_temp.type = DiagPDF # Diagnostic type
quokka.hist_temp.file = PDFTempDens # Output file prefix
@@ -100,7 +108,9 @@ Normalization of the output is left up to the user.
quokka.hist_temp.gasDensity.range = 1e-29 1e-23 # gasDensity: (Optional, default: data range) Specify min/max of bins
*Filters (based on any variables, not necessary those used for the histogram) can be optionally added:* ::
*Filters (based on any variables, not necessary those used for the histogram) can be optionally added:*

.. code-block:: ini
quokka.hist_temp.filters = dense # (Optional) List of filters
quokka.hist_temp.dense.field_name = gasDensity # Filter field
@@ -117,7 +127,7 @@ Ascent allows you to generate visualizations (as PNG images) while the simulatio
``export Ascent_DIR=/software/projects/pawsey0807/bwibking/ascent_06082023/install/ascent-develop/lib/cmake/ascent``.

Compiling Ascent via Spack
^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^
1. Run ``spack external find``.
2. Make sure there are entries listed for ``hdf5``, ``cuda``, and ``openmpi`` in your ``~/.spack/packages.yaml`` file.
3. Add `buildable: False <https://spack.readthedocs.io/en/latest/build_settings.html#external-packages>`_ to each entry.
@@ -128,13 +138,13 @@ For A100 GPUs, change the above lines to `cuda_arch=80`.
Currently, it's not possible to `build for both GPU models at the same time <https://github.com/Alpine-DAV/ascent/issues/950#issuecomment-1153243232>`_.

Compiling Quokka with Ascent support
^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1. Load Ascent: ``spack load ascent``
2. Add ``-DAMReX_ASCENT=ON -DAMReX_CONDUIT=ON`` to your CMake options.
3. Compile your problem, e.g.: ``ninja -j4 test_hydro3d_blast``

Customizing the visualization
^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Add an `ascent_actions.yaml file <https://ascent.readthedocs.io/en/latest/Actions/Actions.html>`_ to the simulation working directory.
This file can even be edited while the simulation is running!

4 changes: 2 additions & 2 deletions docs/instability.rst
@@ -1,7 +1,7 @@
.. Debugging simulation instability
Debugging simulation instability
=====
================================

Nonlinear stability of systems of PDEs is an unsolved problem. There is no complete, rigorous mathematical theory.
There are two concepts, however, that are closely associated with nonlinear stability:
@@ -24,7 +24,7 @@ It is also possible that the entropy is nondecreasing, but insufficient entropy
compared to the amount that should be produced physically. This will cause an unphysical oscillatory solution.

Ways to improve stability
-----------------------
-------------------------
The solution is either to reduce the timestep or add additional dissipation:

* set the initial timestep to be 0.1 or 0.01 of the CFL timestep by setting ``sim.initDt_`` appropriately
6 changes: 3 additions & 3 deletions docs/performance.rst
@@ -1,7 +1,7 @@
.. Performance
Performance tips
=====
================

Prerequisites
-----------------------
@@ -15,7 +15,7 @@ You should:
* Know that calling `amrex::ParallelFor` launches a GPU kernel (when GPU support is enabled at compile time).

GPU hardware characteristics
-----------------------
----------------------------

GPUs have hardware design features that make their performance characteristics significantly different from CPUs. In practice, two factors dominate GPU performance behavior:

@@ -26,7 +26,7 @@ GPUs have hardware design features that make their performance characteristics s
* For more details, see these `AMD website notes <https://gpuopen.com/learn/amd-lab-notes/amd-lab-notes-register-pressure-readme/>`_ and OLCF `training materials <https://www.olcf.ornl.gov/wp-content/uploads/Intro_Register_pressure_ORNL_20220812_2083.pdf>`_.

MPI communication latency vs. bandwidth
-----------------------
---------------------------------------

A traditional rule of thumb for CPU-based MPI codes is that communication latency often limits performance when scaling to large number of CPU cores (or, equivalently, MPI ranks). We have found that this is *not* the case for Quokka when running on GPU nodes (by, e.g., adding additional dummy variables to the state arrays).

6 changes: 4 additions & 2 deletions docs/postprocessing.rst
@@ -1,7 +1,7 @@
.. Postprocessing
Postprocessing
=====
==============

There are several ways to post-process the output of Quokka simulations.
AMReX PlotfileTools, yt, and VisIt all allow you to analyze the outputs after they are written to disk.
@@ -30,7 +30,9 @@ yt
PlotfileTools (see above) can be used instead for axis-aligned slice plots.

The plotfile directory can be loaded with ``yt.load`` as usual. However, the standard fields such as ``('gas', 'density')`` are not defined.
Instead, you have to use non-standard fields. Examine ``ds.field_list`` to see the fields that exist in the plotfiles. These should be: ::
Instead, you have to use non-standard fields. Examine ``ds.field_list`` to see the fields that exist in the plotfiles. These should be:

.. code-block:: python
[('boxlib', 'gasDensity'), ('boxlib', 'gasEnergy'),
('boxlib', 'radEnergy'), ('boxlib', 'scalar'),
4 changes: 2 additions & 2 deletions docs/running_on_hpc_clusters.rst
@@ -1,7 +1,7 @@
.. Running on HPC clusters
Running on HPC clusters
=====
=======================

Instructions for running on various HPC clusters are given below.

@@ -35,7 +35,7 @@ Then a single-node test job can be run with: ::
sbatch scripts/setonix-1node.submit

Workaround for interconnect issues
^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If interconnect issues are observed, it is recommended to add the line ::

2 changes: 1 addition & 1 deletion docs/tests/radhydro_pulse.rst
@@ -1,7 +1,7 @@
.. Advecting radiation pulse test
Advecting radiation pulse test
=========================
==============================

This test demonstrates the code’s ability to deal with the relativistic
correction source terms that arise from the mixed frame formulation of