Merge pull request #310 from JCSDA-internal/develop
Publish recent changes to master
mer-a-o authored Sep 24, 2021
2 parents 2abd7d6 + 42ed9a6 commit 93c1172
Showing 61 changed files with 4,155 additions and 170 deletions.
1 change: 1 addition & 0 deletions docs/inside/jedi-components/index.rst
@@ -10,4 +10,5 @@ JEDI Components
ioda/index.rst
ufo/index.rst
fv3-jedi/index.rst
mpas-jedi/index.rst
configuration/index.rst
130 changes: 130 additions & 0 deletions docs/inside/jedi-components/ioda/file-formats.rst
@@ -0,0 +1,130 @@
.. _top-ioda-file-formats:

IODA File Formats
=================

Overview
--------

IODA can read files in the following formats:

* HDF5
* ODB

and write files in the following formats:

* HDF5.

The following sections describe how these formats are handled from the user's point of view.

HDF5
----

To read an HDF5 file into an ``ObsSpace``, it is enough to set the ``obs space.obsdatain.obsfile`` option in the YAML configuration file to the HDF5 file path. For example,

.. code-block:: YAML

   obs space:
     obsdatain:
       obsfile: Data/testinput_tier_1/sondes_obs_2018041500_m.nc4

Similarly, to write the contents of an ``ObsSpace`` to disk at the end of the observation processing pipeline, use the ``obs space.obsdataout.obsfile`` option:

.. code-block:: YAML

   obs space:
     obsdataout:
       obsfile: Data/sondes_obs_2018041500_m_out.nc4

Each MPI rank will then write its observations to a separate file whose name is obtained by inserting the rank before the extension of the file name taken from the ``obs space.obsdataout.obsfile`` option. In the example above, processes 0 and 1 would produce files called ``Data/sondes_obs_2018041500_m_out_0000.nc4`` and ``Data/sondes_obs_2018041500_m_out_0001.nc4``, respectively (assuming observations were split across ranks only in space; if they were also split in time, file names would have an extra suffix with the index of the time partition).
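
The naming rule can be sketched in a few lines of Python. This is only an illustration, not ``ioda`` code: the helper name ``rank_output_path`` is hypothetical, the underscore and four-digit zero-padded rank are inferred from the example file names above, and the extra time-partition suffix is not handled.

.. code-block:: python

   from pathlib import PurePosixPath


   def rank_output_path(obsfile: str, rank: int) -> str:
       """Insert the zero-padded MPI rank before the file extension."""
       path = PurePosixPath(obsfile)
       # Assumption: "_" plus a four-digit rank, as in the example file names.
       return str(path.with_name(f"{path.stem}_{rank:04d}{path.suffix}"))


   # Print the names that ranks 0 and 1 would use for the example above.
   for rank in range(2):
       print(rank_output_path("Data/sondes_obs_2018041500_m_out.nc4", rank))

A helper like this may be handy in post-processing scripts that need to locate the per-rank output files of a run.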

ODB
---

.. note::

   To be able to read ODB files, ``ioda`` needs to be built in an environment providing access to ECMWF's ``odc`` library. All of the development containers (Intel, GNU and Clang) include this library.

To read an ODB file into an ``ObsSpace``, three options need to be set in the ``obs space.obsdatain`` section of the YAML configuration file:

* ``obsfile``: the path to the ODB file;
* ``mapping file``: the path to a YAML file mapping ODB column names and units to IODA variable names;
* ``query file``: the path to a YAML file defining the parameters of an SQL query selecting the required data from the ODB file.

The syntax of the mapping and query files is described in the subsections below. The ``ioda`` repository contains sample mapping and query files that should be sufficient for most needs. There is a single mapping file, ``test/testinput/odb_default_name_map.yml``, and one query file per observation type, e.g. ``test/testinput/iodatest_odb_aircraft.yml`` for aircraft observations and ``test/testinput/iodatest_odb_atms.yml`` for ATMS observations. For example, a YAML file used for aircraft data processing could contain the following ``obs space.obsdatain`` section:

.. code-block:: YAML

   obs space:
     obsdatain:
       obsfile: Data/testinput_tier_1/aircraft.odb
       mapping file: testinput/odb_default_name_map.yml
       query file: testinput/iodatest_odb_aircraft.yml

Mapping files
"""""""""""""

Here's an example ODB mapping file:

.. code-block:: YAML

   ioda:
     variables:
     - name: MetaData/latitude
       source: lat
     - name: MetaData/longitude
       source: lon
     - name: ObsValue/relative_humidity
       source: 29
       unit: percentage
     - name: ObsValue/surface_pressure
       source: 110
       unit: hectopascal
     complementary variables:
     - input names: [site_name_1, site_name_2, site_name_3, site_name_4]
       output name: MetaData/station_id

The top-level section ``ioda`` is required. The ``ioda.variables`` section is optional (but typically needed); if present, it must be a list of items defining the mapping of individual ODB columns to ``ioda`` variables. Within each item, the following keys are recognized:

* ``source`` (required): name of an ODB column or numeric identifier of a geophysical variable (see https://apps.ecmwf.int/odbgov/varno for the full list);

* ``name`` (required): name of the corresponding ``ioda`` variable;

* ``unit`` (optional): name of the unit used in the ODB file. If specified, values loaded from the ODB file will be converted to the unit used in ``ioda`` (typically a basic SI unit). Currently the following units are supported: ``celsius``, ``knot``, ``percentage`` (converted to a fraction), ``okta`` (1/8 -- converted to a fraction), ``degree`` (converted to radians) and ``hectopascal`` (converted to pascals).
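
The effect of the ``unit`` key can be illustrated with a small conversion table. This is a hedged sketch rather than the ``ioda`` implementation: the names ``TO_IODA_UNIT`` and ``convert`` are hypothetical, and the targets for ``celsius`` (kelvin) and ``knot`` (metres per second) are assumptions based on the remark that ``ioda`` typically uses basic SI units.

.. code-block:: python

   import math

   # Assumptions: celsius -> kelvin and knot -> metres per second. The targets
   # of the other four conversions are the ones stated in the list above.
   TO_IODA_UNIT = {
       "celsius": lambda v: v + 273.15,
       "knot": lambda v: v * 1852.0 / 3600.0,
       "percentage": lambda v: v / 100.0,
       "okta": lambda v: v / 8.0,
       "degree": math.radians,
       "hectopascal": lambda v: v * 100.0,
   }


   def convert(value, unit=None):
       """Return value unchanged when no unit is given, else convert it."""
       if unit is None:
           return value
       try:
           return TO_IODA_UNIT[unit](value)
       except KeyError:
           raise ValueError(f"unsupported unit: {unit!r}") from None


   print(convert(50.0, "percentage"))     # relative humidity as a fraction
   print(convert(1013.0, "hectopascal"))  # surface pressure in pascals

With the example mapping file above, ``ObsValue/relative_humidity`` would go through the ``percentage`` entry and ``ObsValue/surface_pressure`` through the ``hectopascal`` entry, while latitude and longitude, which have no ``unit`` key, would be left unchanged.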

The ``ioda.complementary variables`` section is also optional; if present, it must be a list of items defining groups of ODB text columns that should be merged into single ``ioda`` variables. This merging is required because entries of ODB text columns are limited to 8 characters each. Within each item, the following keys are recognized:

* ``input names`` (required): ordered list of names of ODB columns that should be merged;
* ``output name`` (required): name of the ``ioda`` variable that will hold the contents of the merged columns;
* ``output variable data type`` (optional): if present, must be set to ``string``;
* ``merge method`` (optional): if present, must be set to ``concat``.
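
The ``concat`` merge can be pictured as a row-by-row string concatenation of the input columns in the order listed. The sketch below is illustrative only: ``concat_columns`` and the sample station names are hypothetical, and whether ``ioda`` trims any padding from the merged string is not stated here, so no trimming is done.

.. code-block:: python

   def concat_columns(columns):
       """Merge parallel text columns row by row, keeping the column order."""
       return ["".join(parts) for parts in zip(*columns)]


   # Two observations whose station names are spread over two text columns,
   # each entry holding at most 8 characters.
   site_name_1 = ["HEATHROW", "OSLO"]
   site_name_2 = [" AIRPORT", ""]
   print(concat_columns([site_name_1, site_name_2]))

The order of ``input names`` matters because the pieces are joined exactly in that order.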

Certain variables are handled in a special way. Columns for date and time (``date``, ``time``, ``receipt_date``, ``receipt_time``) are not specified in the mapping file; instead they are converted into the string date/time representations used by ``ioda`` and stored in ``ioda`` variables ``MetaData/datetime`` and ``MetaData/receiptdatetime``. They still need to be provided in the ``variables`` list in the query file.
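
As a rough illustration of this conversion, the sketch below combines one ``date`` and one ``time`` value into a single string. It rests on two assumptions that are not stated above: that the ODB columns hold integers of the form ``YYYYMMDD`` and ``HHMMSS``, and that the target representation is an ISO 8601 UTC string. The function name ``odb_to_datetime_string`` is hypothetical.

.. code-block:: python

   from datetime import datetime


   def odb_to_datetime_string(date: int, time: int) -> str:
       """Combine integer date and time values into one date/time string."""
       # Assumption: date is YYYYMMDD and time is HHMMSS (leading zeros dropped).
       stamp = datetime.strptime(f"{date:08d}{time:06d}", "%Y%m%d%H%M%S")
       # Assumption: the target format is ISO 8601 with a trailing "Z" for UTC.
       return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")


   print(odb_to_datetime_string(20180415, 3000))

Under these assumptions, the same function would apply to the ``receipt_date`` and ``receipt_time`` pair.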

Query files
"""""""""""

The following ODB query file

.. code-block:: YAML

   variables:
   - name: lat
   - name: lon
   - name: flight_phase
   - name: initial_obsvalue
   - name: varno
   where:
     varno: [2,111,112]

corresponds to the following SQL query:

.. code-block:: SQL

   SELECT lat, lon, flight_phase, initial_obsvalue, varno
   FROM <ODB file name>
   WHERE (varno = 2 OR varno = 111 OR varno = 112);

This is the query used to retrieve data from the input ODB file. The names of the specified columns are converted to ``ioda`` variable names when the ``ObsSpace`` object is constructed.

In general, a query file must contain a ``where`` section with the ``varno`` key set to the list of identifiers of the geophysical variables of interest (see https://apps.ecmwf.int/odbgov/varno for the full list). In addition, it can contain an optional ``variables`` list; the ``name`` key in each item in this list is the name of a column to be retrieved from the ODB file.
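
The correspondence between a query file and the SQL statement can be sketched as follows. This is not how ``ioda`` builds the query internally; ``build_query`` is a hypothetical helper that takes the already parsed YAML content as a dictionary. Because the text does not say which columns are selected when the ``variables`` list is absent, the sketch simply rejects that case.

.. code-block:: python

   def build_query(config: dict, odb_file: str) -> str:
       """Build the SELECT statement described by a parsed query file."""
       columns = [item["name"] for item in config.get("variables", [])]
       if not columns:
           # The text does not say what is selected without a variables list.
           raise ValueError("this sketch needs a non-empty 'variables' list")
       varnos = config["where"]["varno"]  # required section and key
       condition = " OR ".join(f"varno = {int(v)}" for v in varnos)
       return f"SELECT {', '.join(columns)} FROM {odb_file} WHERE ({condition});"


   # The parsed form of the example query file shown earlier in this section.
   query = {
       "variables": [{"name": n} for n in
                     ("lat", "lon", "flight_phase", "initial_obsvalue", "varno")],
       "where": {"varno": [2, 111, 112]},
   }
   print(build_query(query, "<ODB file name>"))

Apart from line breaks, the printed statement matches the SQL query shown in the example above.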
3 changes: 2 additions & 1 deletion docs/inside/jedi-components/ioda/index.rst
@@ -8,7 +8,7 @@ IODA is the **Interface for Observational Data Access**. This is the component
These documents give a high-level overview of the IODA code repository. A low-level description of the classes, functions, and subroutines is also available, produced by means of the `Doxygen document generator <https://www.doxygen.nl/index.html>`_.

+----------------------------------------------------------------------------------------+
| `Doxygen Documentation <http://data.jcsda.org/doxygen/Release/1.1.0/ioda/index.html>`_ |
| `Doxygen Documentation <http://data.jcsda.org/doxygen/Release/ioda/2.1.0/index.html>`_ |
+----------------------------------------------------------------------------------------+


@@ -18,3 +18,4 @@ These documents give a high-level overview of the IODA code repository. A low-l
introduction
details
interface
file-formats
153 changes: 153 additions & 0 deletions docs/inside/jedi-components/mpas-jedi/build.rst
@@ -0,0 +1,153 @@
.. _top-mpas-jedi-build:

Building and Testing MPAS-JEDI
==============================

This section describes how to build MPAS-JEDI using CMake, then confirm that your build is working
properly with CTest. The use of CMake and CTest is described in the :doc:`JEDI CMake, CTest, and
ecbuild </inside/developer_tools/cmake>` documentation.

MPAS-BUNDLE
-----------

In order to build MPAS-JEDI and its dependencies, it is recommended to use MPAS-BUNDLE, available at
https://github.com/JCSDA/mpas-bundle. Within MPAS-BUNDLE, the file named :code:`CMakeLists.txt`
controls the dependency chain of components that are either essential (e.g., OOPS, IODA, UFO, SABER,
and CRTM) or optional (e.g., RTTOV) to the procedure that eventually generates MPAS-JEDI
executables. MPAS-BUNDLE is built using :code:`ecbuild`. Full details on how to build any JEDI
bundle are provided :doc:`elsewhere </using/building_and_running/building_jedi>`, and it is
recommended to familiarize yourself with those instructions before continuing here.

.. _build-test-mpas-cheyenne:

Building and testing MPAS-BUNDLE on Cheyenne
--------------------------------------------

Most development and testing of MPAS-JEDI has been performed on NCAR's Cheyenne HPC
system. Custom scripts for creating the required build environment on Cheyenne are provided
in MPAS-BUNDLE. After cloning MPAS-BUNDLE from GitHub, you can find these scripts in
:code:`mpas-bundle/env-setup`. Before executing the :code:`ecbuild` command, :code:`source`
the script appropriate for your choice of compiler, MPI implementation, and shell (e.g.,
:code:`gnu-openmpi-cheyenne.sh`). The commands in the environment script are consistent with the
instructions for Cheyenne under :ref:`top-modules`.

After building MPAS-BUNDLE, it is recommended to run the ctests. Passing this suite of tests
confirms that your build is working as expected.

Starting from a project directory such as :code:`$HOME/jedi`, the entire build and test workflow
on Cheyenne would look like:

.. code-block:: bash

   git clone https://github.com/JCSDA/mpas-bundle.git # this creates the 'mpas-bundle' directory
   source mpas-bundle/env-setup/<desired environment script>
   mkdir ./<build-directory>
   cd ./<build-directory>
   ecbuild ../mpas-bundle
   make update
   make -j4
   cd mpas-jedi
   ctest

Notes about building on Cheyenne:

- The :code:`gnu-openmpi` environment has been more extensively tested than the :code:`intel-impi`
  environment on Cheyenne.
- The :code:`<build-directory>` cannot be the directory named :code:`mpas-bundle`, where the
  repository is cloned, because doing so creates a conflict between the source code
  directory and the CMake-generated build sub-directories.
- Users can expect the above build and test procedure to take approximately 45 minutes. For some
  speedup, it is recommended to execute the :code:`make` step in a job script with :code:`-j16`
  instead of :code:`-j4`, which will use 16 processors instead of 4 in the parallel build. Much of
  the total time is spent in the :code:`ecbuild` step, which downloads the code for the first
  time.

Building MPAS-BUNDLE in Singularity
-----------------------------------

MPAS-BUNDLE can also be built and tested in the :doc:`JEDI development Singularity container
</using/jedi_environment/singularity>`. Detailed instructions are provided at that link. If you
do not plan to or are unable to install Singularity natively, you may be interested to learn
:doc:`how to launch a Singularity container in a Vagrant Virtual Machine
</using/jedi_environment/vagrant>`. When working in the Singularity container, the main difference
from the instructions provided above for Cheyenne is that the environment is already set up properly
within the container. Thus there is no need to :code:`source` an environment setup file.

.. _controltesting-mpas:


Built executables
-----------------

After completing the MPAS-BUNDLE build, users have access to many executables under
:code:`<build-directory>/bin`, many of which are generated when building the projects on which
MPAS-JEDI is dependent (OOPS, UFO, SABER). The executables that are relevant to MPAS are as
follows, grouped separately for MPAS-A and MPAS-JEDI.

MPAS-A
""""""
- :code:`mpas_atmosphere`: can be used interchangeably with the :code:`atmosphere_model` executable
  that would normally be built using the non-JEDI (standalone) MPAS-Model build mechanism for
  the :code:`atmosphere` core. Its purpose is to integrate the model forward in time from an
  initial time to a final time with periodic IO of model fields of importance.
- :code:`mpas_init_atmosphere`: can be used interchangeably with the
  :code:`init_atmosphere_model` executable that would normally be built using the non-JEDI
  (standalone) MPAS-Model build mechanism for the :code:`init_atmosphere` core. Its purpose is
  to generate cold-start initial condition and surface input files.

MPAS-JEDI
"""""""""
Each of these executables is a model-specific implementation of a generic application that
is derived from the :code:`oops::Application` class, i.e.,
:code:`oops/src/oops/runs/Application.h`. Descriptions of the generic applications are located under
the :doc:`OOPS Applications </inside/jedi-components/oops/applications/index>` documentation. Here
we give short synopses of a few specific MPAS-JEDI implementations.

- Applications with one initial state

  - :code:`mpasjedi_convertstate.x` (:code:`oops::ConvertState`)
  - :code:`mpasjedi_dirac.x` (:code:`oops::Dirac`)
  - :code:`mpasjedi_forecast.x` (:code:`oops::Forecast`): essentially does the same as the
    :code:`mpas_atmosphere` executable, but through the JEDI generic framework via the MPAS-JEDI
    interface. There is more overhead than when running the non-JEDI executable, and this
    requires a YAML file in addition to the standard :code:`namelist.atmosphere` used to configure
    :code:`mpas_atmosphere`.
  - :code:`mpasjedi_gen_ens_pert_B.x` (:code:`oops::GenEnsPertB`)
  - :code:`mpasjedi_hofx.x` (:code:`oops::HofX4D`)
  - :code:`mpasjedi_hofx3d.x` (:code:`oops::HofX3D`)
  - :code:`mpasjedi_parameters.x` (:code:`saber::EstimateParams`): used to estimate static
    background error covariance and localization matrices
  - :code:`mpasjedi_staticbinit.x` (:code:`oops::StaticBInit`)
  - :code:`mpasjedi_variational.x` (:code:`oops::Variational`): carries out many different
    flavors of variational data assimilation (3DVar, 3DEnVar, 3DFGAT, 4DEnVar) with a variety of
    incremental minimization algorithms

- Applications with multiple initial states

  - :code:`mpasjedi_eda.x` (:code:`oops::EnsembleApplication<oops::Variational>`)
  - :code:`mpasjedi_enshofx.x` (:code:`oops::EnsembleApplication<oops::HofX4D>`)
  - :code:`mpasjedi_rtpp.x` (:code:`oops::RTPP`): standalone application that carries out
    Relaxation to Prior Perturbation, as introduced by Zhang et al. (2004). The intended purpose
    is to inflate the analysis ensemble spread after running the EDA application.



Most of the MPAS-JEDI executables are exercised in ctests. As users learn how to use MPAS-JEDI for
larger-scale applications, it is useful to consider the ctests as examples and templates. For more
information on the individual ctests, see :doc:`the documentation for their YAML configuration files
</inside/jedi-components/mpas-jedi/data>`.



Controlling the testing
-----------------------

In addition to the basic :code:`ctest` command shown in :ref:`build-test-mpas-cheyenne`, which runs
all of the available tests for MPAS-JEDI, :code:`ctest` has basic flags and arguments available for
selecting a subset of tests. :code:`ctest` also automatically provides some logging functionality
that is useful for reviewing passing and failing test cases. Both of those aspects of
:code:`ctest` are described in more detail within the :doc:`JEDI Developer Tools
</inside/developer_tools/cmake>` and :doc:`JEDI Testing </inside/testing/unit_testing>`
documentation.

References
----------
Zhang, F., C. Snyder, and J. Sun (2004): Impacts of initial estimate and observation availability on convective-scale data assimilation with an ensemble Kalman filter. Mon. Wea. Rev., 132, 1238–1253.