Introduction to YAML
YAML stands for ‘YAML Ain’t Markup Language’. It is a human-readable data format, similar in purpose to XML, that allows configuration to be written in a controlled, structured way in plain text files. JEDI uses YAML for all of its configuration files, and the format is becoming increasingly important for atmospheric science applications.
Useful websites and references on the YAML file format:
https://blog.stackpath.com/yaml/
https://docs.ansible.com/ansible/latest/reference_appendices/YAMLSyntax.html
http://www.yamllint.com/ - a YAML web validator (check syntax)
In the next two sections, we will look at a sample YAML test file from each of two different components and briefly explain what each line represents. This is by no means an exhaustive discussion, but it shows how to work with YAML files and where the configurations of different applications are similar and where they differ.
First, let’s start with something simple. One of the shortest YAML test files is for the aircraft observation operator tests, fv3-bundle/ufo/test/testinput/aircraft.yaml.
1  window begin: '2018-04-14T20:30:00Z'
2  window end: '2018-04-15T03:30:00Z'
3
4  observations:
5  - obs space:
6      name: Aircraft
7      obsdatain:
8        obsfile: Data/ioda/testinput_tier_1/aircraft_obs_2018041500_m.nc4
9      simulated variables: [air_temperature,specific_humidity]
10     #simulated variables: [eastward_wind, northward_wind]
11   obs operator:
12     name: VertInterp
13   linear obs operator test:
14     coef TL: 0.1
15     tolerance TL: 1.0e-13
16     tolerance AD: 1.0e-11
17   geovals:
18     filename: Data/ufo/testinput_tier_1/aircraft_geoval_2018041500_m.nc4
19   vector ref: GsiHofX
20   tolerance: 1.0e-6
The above lines are numbered, and each will be described below:
1 - window begin: the timestamp (as a string) of the beginning of the assimilation window
2 - window end: the timestamp (as a string) of the end of the assimilation window
4 - observations: the top-level section that describes all of the observation types to be used
5 - In YAML, a ‘-’ indicates an item in a list, so this line begins the entry for one observation type. In this file, the list has only one item
6 - name of the obs space to reference in memory (can be anything you want)
7 - obsdatain: this section describes the input
8 - path to the input IODA observation file
9 - list of variables/observations to simulate
10 - a commented-out alternative list of simulated variables (a line beginning with # is a comment)
11 - obs operator: this section sets up the UFO that you wish to use
12 - name of the UFO; this must match what is in the source code/factory
13 - linear obs operator test: this section defines parameters for the linear operator tests
14 - coefficient of the perturbation to apply in the tangent linear test
15 - tolerance of the tangent linear test
16 - tolerance of the adjoint test
17 - geovals: this section is for the pre-interpolated model fields used by these unit tests
18 - path to the file of interpolated GeoVaLs
19 - vector ref: tells the test executable what to use as a reference for the H(x) calculation
20 - tolerance of the test comparing the output of the UFO H(x) with the GSI H(x)
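To make the nesting concrete, here is a minimal sketch of the same configuration written as the nested Python objects a YAML parser would typically return: mappings become dictionaries and ‘-’ items become list entries. The nesting is inferred from the descriptions above rather than taken from the JEDI source, and the variable name config is only illustrative.

```python
# Sketch: aircraft.yaml expressed as nested Python objects.
# Assumption: the nesting follows the line-by-line description above
# (mappings -> dict, "-" list items -> list entries).
config = {
    "window begin": "2018-04-14T20:30:00Z",
    "window end": "2018-04-15T03:30:00Z",
    "observations": [  # one list entry per "- obs space:" item
        {
            "obs space": {
                "name": "Aircraft",
                "obsdatain": {
                    "obsfile": "Data/ioda/testinput_tier_1/aircraft_obs_2018041500_m.nc4",
                },
                "simulated variables": ["air_temperature", "specific_humidity"],
            },
            "obs operator": {"name": "VertInterp"},
            "linear obs operator test": {
                "coef TL": 0.1,
                "tolerance TL": 1.0e-13,
                "tolerance AD": 1.0e-11,
            },
            "geovals": {
                "filename": "Data/ufo/testinput_tier_1/aircraft_geoval_2018041500_m.nc4",
            },
            "vector ref": "GsiHofX",
            "tolerance": 1.0e-6,
        },
    ],
}

# Walk the structure in the same order as the description.
for obs in config["observations"]:
    space = obs["obs space"]
    print(space["name"], "->", obs["obs operator"]["name"])
    print("  input file:", space["obsdatain"]["obsfile"])
    print("  variables: ", ", ".join(space["simulated variables"]))
```

Adding a second observation type would mean appending another dictionary to the observations list, which corresponds to a second ‘-’ item in the YAML file.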
This YAML file is used by the following ctests:
test_ufo_vertinterp_aircraft_opr
test_ufo_linopr_vertinterp_aircraft
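The window begin and window end values in this file are ISO 8601 timestamps, where the trailing Z denotes UTC. As a small illustration, the sketch below uses only the Python standard library to compute the length of the assimilation window they define. The helper name parse_utc is hypothetical and not part of JEDI.

```python
from datetime import datetime


def parse_utc(stamp: str) -> datetime:
    """Parse a timestamp such as '2018-04-14T20:30:00Z' (trailing Z = UTC)."""
    # Replace Z with an explicit offset so that fromisoformat also
    # accepts the string on Python versions older than 3.11.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


window_begin = parse_utc("2018-04-14T20:30:00Z")
window_end = parse_utc("2018-04-15T03:30:00Z")

# Subtracting two datetimes gives a timedelta.
window_length = window_end - window_begin
hours = window_length.total_seconds() / 3600
print(f"window length: {hours:g} hours")  # prints "window length: 7 hours"
```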
Note that this is a very simple YAML configuration file, and that most will be longer and have more components. Thus, we will look at an example from FV3-JEDI to see a different perspective. The practicals later in this document will feature more comprehensive YAML files.
Building on the previous example, which runs the vertical interpolation observation operator for aircraft observations, let us look at an FV3-JEDI YAML file, fv3-jedi/test/testinput/hofx_nomodel.yaml, which runs the fv3jedi_hofx_nomodel.x application to produce H(x).
1  window begin: '2018-04-14T21:00:00Z'
2  window length: PT6H
3  forecast length: PT6H
4  geometry:
5    nml_file_mpp: Data/fv3files/fmsmpp.nml
6    trc_file: Data/fv3files/field_table
7    akbk: Data/fv3files/akbk64.nc4
8    # input.nml
9    layout: [1,1]
10   io_layout: [1,1]
11   npx: 13
12   npy: 13
13   npz: 64
14   ntiles: 6
15   fieldsets:
16   - fieldset: Data/fieldsets/dynamics.yaml
17   - fieldset: Data/fieldsets/ufo.yaml
18 forecasts:
19   #state:
20   filetype: gfs
21   datapath: Data/inputs/gfs_c12/bkg/
22   filename_core: 20180415.000000.fv_core.res.nc
23   filename_trcr: 20180415.000000.fv_tracer.res.nc
24   filename_sfcd: 20180415.000000.sfc_data.nc
25   filename_sfcw: 20180415.000000.fv_srf_wnd.res.nc
26   filename_cplr: 20180415.000000.coupler.res
27   state variables: [u,v,ua,va,T,DELP,sphum,ice_wat,liq_wat,o3mr,phis,
28                     slmsk,sheleg,tsea,vtype,stype,vfrac,stc,smc,snwdph,
29                     u_srf,v_srf,f10m,sss]
30 observations:
31 - obs space:
32     name: Aircraft
33     obsdatain:
34       obsfile: Data/obs/testinput_tier_1/aircraft_obs_2018041500_m.nc4
35     obsdataout:
36       obsfile: Data/hofx/aircraft_hofx_gfs_2018041500_m.nc4
37     simulated variables: [eastward_wind,northward_wind,air_temperature]
38   obs operator:
39     name: VertInterp
40 - obs space:
41     name: AMSUA-NOAA19
42     obsdatain:
43       obsfile: Data/obs/testinput_tier_1/amsua_n19_obs_2018041500_m.nc4
44     obsdataout:
45       obsfile: Data/hofx/amsua_n19_hofx_gfs_2018041500_m.nc4
46     simulated variables: [brightness_temperature]
47     channels: 1-15
48   obs operator:
49     name: CRTM
50     Absorbers: [H2O,O3,CO2]
51     Clouds: [Water, Ice]
52     Cloud_Fraction: 1.0
53     obs options:
54       Sensor_ID: amsua_n19
55       EndianType: little_endian
56       CoefficientPath: Data/crtm/
57 prints:
58   frequency: PT3H
Again, the above lines are numbered, and will be summarized below.
1 - Window begin: as in the UFO example, the start of the assimilation window
2 - Window length: an ISO 8601 duration; here PT6H specifies a 6-hour assimilation window
3 - Forecast length; the background used as input is FH006 (a 6-hour forecast)
4-17 - The model geometry for FV3 is defined here
5 - Path to the FMS/MPP namelist file
6 - Path to the FV3 field table file
7 - Path to the FV3 AK/BK hybrid coordinate netCDF input file
9 - The MPI layout for each FV3 cubed-sphere tile
10 - The IO layout for each FV3 cubed-sphere tile
11-14 - FV3 grid size definitions
15-17 - List of fieldsets (YAML files that describe which input FV3 tile file contains each model field, the field’s units, etc.)
18-29 - This section describes the input model forecast
20 - File type of the input model files: gfs or geos
21 - Path to the input model restart files
22-26 - Filenames of the core, trcr, sfcd, sfcw, and cplr restart files
27-29 - List of the state variables to read in from the restart files
30-56 - As before, this is the observations section, which defines all observations
31-39 - Aircraft observations are defined here. This is very similar to the UFO example, with a few exceptions. Lines 35-36 are obsdataout/obsfile, the path to an output file to which the H(x) values (and other things) are written. The lines from the UFO example that are missing here (the tolerances, geovals, etc.) are needed only for the simple unit tests, not for a ‘real’ case such as this one that computes H(x).
40-56 - This section is an example of how to simulate AMSU-A brightness temperatures. Line 47 specifies the channels (either a single range like 1-15 or a combination like 1-15,17-20,22,24-30). Unlike the vertical interpolation observation operator, the CRTM operator requires some configuration options to be set (lines 50-56): the list of absorbers (50), the clouds (51), and the parameters that select the correct CRTM coefficients (54-56).
57-58 - Specifies that output is printed every 3 hours of model time (not really used if no model is being integrated or First-Guess at Appropriate Time (FGAT) is not used).
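The channel specification described for line 47 can be read as a comma-separated list of single channels and inclusive ranges. The sketch below shows one way such a string could be expanded into explicit channel numbers. It illustrates the notation only: the function name expand_channels is hypothetical, and this is not the parser JEDI itself uses.

```python
from typing import List


def expand_channels(spec: str) -> List[int]:
    """Expand a spec such as '1-15,17-20,22' into a list of channel numbers.

    Assumption: ranges are inclusive and parts are separated by commas.
    """
    channels: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range such as "17-20" covers both end points.
            first, last = (int(x) for x in part.split("-"))
            channels.extend(range(first, last + 1))
        else:
            # A single channel such as "22".
            channels.append(int(part))
    return channels


print(expand_channels("1-15"))                      # channels 1 through 15
print(len(expand_channels("1-15,17-20,22,24-30")))  # prints 27
```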