Update docs to reflect process.xrb changes
yut23 committed Jan 31, 2024
1 parent 89fbb79 commit 581e932
Showing 2 changed files with 18 additions and 30 deletions.
34 changes: 11 additions & 23 deletions sphinx_docs/source/nersc-hpss.rst
@@ -19,31 +19,17 @@ script ``process.xrb`` in ``job_scripts/hpss/``:

:download:`process.xrb <../../job_scripts/hpss/process.xrb>`

which continually looks for output and stores it to HPSS.
By default, the destination directory on HPSS will have the same name
as the directory your plotfiles are located in. This can be changed by
editing the ``$HPSS_DIR`` variable at the top of ``process.xrb``.
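
As an illustration only (the directory name below is made up, and the exact
syntax should be checked against ``process.xrb`` itself), pointing the
archive at a fixed destination might look like:

.. code-block:: bash

   # hypothetical edit near the top of process.xrb:
   # archive to a fixed HPSS directory instead of the run directory's name
   HPSS_DIR=wdconvect_run2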

The following describes how to use the scripts:

#. Copy the ``process.xrb`` script and the slurm script ``nersc.xfer.slurm``
   into the directory with the plotfiles.

#. Submit the archive job:

.. prompt:: bash

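As a sketch of the submission itself (assuming ``nersc.xfer.slurm`` already
requests the xfer queue internally, so it is submitted like any other Slurm
batch script), this step might look like:

.. prompt:: bash

   sbatch nersc.xfer.slurm
   # check on the archive job along with your other jobs
   squeue --me
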
@@ -80,14 +66,16 @@ Some additional notes:
the date-string to allow multiple archives to co-exist.

* When ``process.xrb`` is running, it creates a lockfile (called
``process.jobid``) that ensures that only one instance of the script
is running at any one time.

.. warning::

Sometimes if the job is not terminated normally, the
``process.jobid`` file will be left behind. Later jobs should be able to
detect this and clean up the stale lockfile; if that fails, you can delete
the file yourself once you are sure the script is not running.
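
If you do have to clean up by hand, a minimal check-then-remove sequence
(run from the directory containing the lockfile) might be:

.. prompt:: bash

   # make sure none of your jobs is still running the archive script
   squeue --me
   # then remove the stale lockfile
   rm process.jobid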

Jobs in the xfer queue start up quickly. The best approach is to start
one as you start your main job (or make it dependent on the main
14 changes: 7 additions & 7 deletions sphinx_docs/source/olcf-workflow.rst
@@ -394,23 +394,23 @@ will also store the inputs, probin, and other runtime generated files. If
Once the plotfiles are archived they are moved to a subdirectory under
your run directory called ``plotfiles/``.

By default, the files will be archived to a directory in HPSS with the same
name as the directory your plotfiles are located in. This can be changed
by editing the ``$HPSS_DIR`` variable at the top of ``process.xrb``.

To use this, we do the following:

#. Copy the ``process.xrb`` and ``summit_hpss.submit`` scripts into the
   directory with the plotfiles.

#. Launch the script via:

.. prompt:: bash

sbatch summit_hpss.submit

It will run for the full time you requested, searching for plotfiles as
they are created and moving them to HPSS as they are produced (it
will always leave the very last plotfile alone, since it can't tell
if it is still being written).
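
To spot-check the archive afterwards, one option (with ``my_run_dir`` standing
in for whatever destination directory ``process.xrb`` used on HPSS) is:

.. prompt:: bash

   # list what has made it to HPSS
   hsi ls my_run_dir
   # locally, plotfiles that were archived successfully are moved here
   ls plotfiles/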
