- Introduction
- Setup
- Tables
- Graphics
- Animations
- QAQC: Making sure there aren't problems with the inputs and outputs
- Reference links
The goal of this file is to provide an overview of the setup and resources required to develop the tables, graphics, and animations provided to King County for the evaluation of nutrient loading impacts. In all of these cases, I was provided with a list of runs to evaluate and was requested to create a particular kind of graphic. Included here is documentation on the code used to create the products. Some work more smoothly than others; they were on a development continuum when funds for the project ran out. The data to create these graphics and tables are not included and would need to be provided by Stefano Mazzilli at Puget Sound Institute. Please email Rachael Mueller with comments, suggestions, and/or corrections to the code or documentation.
- A miniconda environment in which to run scripts and notebooks. My miniconda environment file (jupyter-klone.yaml) looks like:
```yaml
# virtualenv environment description for a useful jupyter
# environment on Klone
#
# Create a virtualenv containing these packages with:
#
#   module load foster/python/miniconda/3.8
#   conda env create -f /gscratch/ssmc/USRS/PSI/Rachael/envs/klone-jupyter.yml
#
# To activate this environment use:
#   conda activate klone-jupyter
#
# Deactivate with:
#   conda deactivate
#
# Delete environment using:
#   conda env remove -n klone_jupyter

name: klone_jupyter
channels:
  - conda-forge
  - defaults
dependencies:
  - pyaml
  - cmocean
  - jupyterlab
  - matplotlib
  - scipy
  - cartopy
  - netCDF4
  - xarray
  - geopandas
```
- An Apptainer "Container" in which to run FFMPEG. See the FFMPEG section of HyakOnboarding.md.
- Define run and model output file locations.
- Create SSM_config_whidbey file with file paths and tag names for this set of model runs.
I developed code to create a table of the total nitrogen loading for each scenario by getting the values directly from the input files. Thanks again to Ben Roberts for his help in understanding the file formats and providing better code than what I had cooked up in my first attempt at creating a solution! I haven't yet developed the method into a script and instead relied on my Table1_NutrientLoadings.ipynb notebook to create the tables.
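For orientation only, here is a minimal sketch of the final assembly step once the per-scenario totals have been read from the input files. The dictionary values, column name, and output file below are placeholders, not the notebook's actual contents.

```python
import pandas as pd

# Placeholder totals standing in for values read from the SSM input files;
# these numbers are for illustration only.
total_nitrogen = {
    "baseline": 100.0,
    "reference": 60.0,
    "3b": 80.0,
}

# Assemble a one-column table, one row per scenario, and save it.
table = pd.DataFrame.from_dict(
    total_nitrogen, orient="index", columns=["Total N loading (kg/day)"]
)
table.index.name = "Scenario"
table.to_csv("Table1_NutrientLoadings.csv")
```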
Questions were raised about the area, volume, depth, nitrogen sources, etc. for the different regions explored in this project, and I helped answer these questions with Table2_RegionInformation.ipynb and the table it produces.
The code for non-compliance uses a threshold value that can be passed in. The default value for the scenario − reference difference is -0.25 mg/L, which is equivalent to the Department of Ecology (DOE) rounding method based on a -0.2 mg/L threshold. See pp. 49 and 50 of Appendix F of the Optimization Report Appendix for more details.
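A minimal sketch of that equivalence, with illustrative arrays and variable names (not taken from calc_noncompliance.py):

```python
import numpy as np

# Illustrative DO values (mg/L); placeholders, not model output.
reference_do = np.array([6.00, 5.50, 4.80, 7.20])
scenario_do = np.array([5.80, 5.24, 4.56, 7.00])

delta = scenario_do - reference_do

# DOE-style rounding method: round the difference to one decimal place,
# then compare against the -0.2 mg/L threshold.
noncompliant_rounded = np.round(delta, 1) < -0.2

# Equivalent default used here: compare the raw difference to -0.25 mg/L.
noncompliant_default = delta < -0.25

print(noncompliant_rounded, noncompliant_default)
```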
- Run process_netcdf.py to generate the minimum dissolved oxygen in the water column and at the bottom level (a minimal sketch of this reduction appears after the note below). Select:
  ```
  case = "whidbey"
  param = "DOXG"
  stat_type = "min"
  ```
  The result is the minimum daily DO across all levels. If separate NetCDF files are wanted for the surface and bottom minimum DO, then add "1" to the function call and these files will be produced as well, i.e.:
  ```
  python ${py_path}/process_netcdf.py ${file_path} ${param} ${case} ${stat_type} 1 1
  ```
The output for the minimum value across the entire water column (`wc`) will be written to the following subdirectory of `ssm['paths']['processed_output']` (set in `SSM_config_whidbey.yaml`):
```
whidbey/DOXG/[RUN_TAG]/wc/daily_min_DOXG_wc.nc
```
Replace `wc` above with `surface` or `bottom` for the paths to those files.
Note: Be sure to update the number of `slurm-arrays` used in the bash scripts to match the number of scenarios. There is no error-check in this bash script, so not all output will be processed if there isn't a sufficient allocation of `slurm-arrays`.
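The sketch below illustrates the kind of daily-minimum reduction described above. The variable name, dimension names, and layer ordering are assumptions about the SSM output layout, not verified against process_netcdf.py.

```python
import xarray as xr

# Assumptions: the DO variable is named "DOXG" with dimensions
# ("time", "siglay", "node"), the time coordinate is datetime64,
# and the first sigma layer is the surface. File name is hypothetical.
ds = xr.open_dataset("ssm_output_whidbey_3b.nc")
doxg = ds["DOXG"]  # dissolved oxygen, mg/L

# Minimum over all vertical levels -> "wc" (water column) product,
# then the minimum within each day.
daily_min_wc = doxg.min(dim="siglay").resample(time="1D").min()

# Surface- and bottom-only variants select a single level first.
daily_min_surface = doxg.isel(siglay=0).resample(time="1D").min()
daily_min_bottom = doxg.isel(siglay=-1).resample(time="1D").min()

daily_min_wc.to_netcdf("daily_min_DOXG_wc.nc")
```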
- Run the bash script for creating the non-compliance table. This bash script calls the python script `calc_noncompliance.py`.
- Change `case` to `whidbey` in `bash_scripts/calc_DO_below_threshold.sh`. This is the only update needed between different cases, because I updated the code to eliminate the need to specify reading `SSM_config_whidbey.yaml` by hard-coding in the use of `case`:
  ```python
  with open(f'../etc/SSM_config_{case}.yaml', 'r') as file:
      ssm = yaml.safe_load(file)
  # get shapefile path
  shp = ssm['paths']['shapefile']
  ```
- I updated the code to use the `run_tag` dictionary (`ssm['run_information']['run_tag']['whidbey']`) for column names to eliminate the need to modify code by hand. (Same upgrade as in `calc_DO_noncompliant.py`.)
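A sketch of that config-driven column naming, assuming the YAML layout implied above (the DataFrame name and exact structure of `run_tag` are illustrative, not taken from the script):

```python
import yaml
import pandas as pd

case = "whidbey"
with open(f'../etc/SSM_config_{case}.yaml', 'r') as file:
    ssm = yaml.safe_load(file)

# Assumed: run_tag maps scenario folders to display names for this case.
run_tags = ssm['run_information']['run_tag'][case]

# Column labels come straight from the config, so adding or renaming a
# scenario means editing the YAML rather than the analysis code.
noncompliance = pd.DataFrame(columns=list(run_tags))
```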
The runtime for creating these spreadsheets (using a slurm array over three nodes, one for each spreadsheet) is: ~16 minutes
File Reference:
Updated `calc_noncompliant_timeseries.sh`:
```bash
#SBATCH --array=0-9

case="whidbey"

run_folders=(
"wqm_baseline"
"wqm_reference"
"3b"
"3e"
"3f"
"3g"
"3h"
"3i"
"3l"
"3m"
)

script_path="/mmfs1/gscratch/ssmc/USRS/PSI/Rachael/projects/KingCounty/SalishSeaModel-analysis/py_scripts/"
```
Updated python script to assign the `.yaml` file name by `case`:
```python
with open(f'../etc/SSM_config_{case}.yaml', 'r') as file:
    ssm = yaml.safe_load(file)
# get shapefile path
shp = ssm['paths']['shapefile']
```
It takes ~6 minutes of computing time to create the spreadsheets.
I similarly updated the code for `plot_noncompliant_timeseries`, except the shell script for this function call requires `run_tag`:
```bash
run_tag=(
"baseline"
"reference"
"3b"
"3e"
"3f"
"3g"
"3h"
"3i"
"3l"
"3m"
)
```
It takes ~0.1 minutes of computing time to create the graphics.
In the interest of time, I gave up on creating an adaptable script and instead modified the Jupyter Notebook plot_noncompliance_timeseries.ipynb for each regional application.
An example of the resulting graphic is below:
File Reference:
- calc_noncompliance_timeseries.sh
- calc_noncompliance_timeseries.py
- plot_noncompliance_timeseries.ipynb

Previously used:
- plot_noncompliance_timeseries.py
- plot_5panel_noncompliant_timeseries.sh
- plot_5panel_noncompliant_timeseries.py
The method for creating this graphic evolved as I developed the ability to read loading quantities directly from the input files. The three regions and corresponding notebooks are:
These plots are still done in a Jupyter Notebook that requires quite a bit of manual editing. See plot_nutrient_loading_whidbey.ipynb. Examples of the resulting graphics are shown below. Grey shading between the `2014` and `reference` conditions is used to highlight changes over time.
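A minimal sketch of that shading approach, assuming monthly loading series for the two conditions (the arrays, labels, and output file below are placeholders, not values from the notebook):

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder monthly nitrogen loadings; illustration only.
months = np.arange(1, 13)
loading_2014 = 200 + 50 * np.sin(months / 2)
loading_reference = loading_2014 * 0.6

fig, ax = plt.subplots()
ax.plot(months, loading_2014, label="2014")
ax.plot(months, loading_reference, label="reference")

# Grey shading between the 2014 and reference conditions highlights the change.
ax.fill_between(months, loading_reference, loading_2014, color="grey", alpha=0.3)

ax.set_xlabel("Month")
ax.set_ylabel("Nitrogen loading (kg/day)")
ax.legend()
fig.savefig("nutrient_loading_example.png")
```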
All animations are created using the software `ffmpeg` through an Apptainer Container on Hyak. It wasn't obvious to me how to control the movie's runtime (its length), and I needed to do a bit of online searching and experimenting to figure out a solution. What I found was that the `-r` flag didn't do anything. The solution that worked for my setup was the `-framerate` flag, e.g.:
```bash
apptainer exec --bind ${graphics_dir} --bind ${output_dir} ~/ffmpeg.sif ffmpeg -start_number 6 -framerate 6 -i ${graphics_dir}${case}_${run_tags[${SLURM_ARRAY_TASK_ID}]}_${param}_${stat_type}_conc_${loc}_%d_whidbeyZoom.png -vcodec mpeg4 ${output_dir}${case}_${run_tags[${SLURM_ARRAY_TASK_ID}]}_${param}_${stat_type}_${loc}_whidbeyZoom.mp4
```
Here, I use `-framerate 6` to get a minute-long movie by incorporating 6 images per second from a pool of ~366 images (366 frames / 6 frames per second ≈ 61 seconds). Including `-vcodec mpeg4` was necessary for me to get the product to play on my macOS Monterey. We ran into trouble playing the output on a PC and worked around this problem by saving to `.avi` before finding the problem was on the PC side. In the process of troubleshooting, I also read that adding `-c:v libx264 -pix_fmt yuv420p` can make the `.mp4` more broadly accessible, but I haven't yet received confirmation that this is an important/necessary specification.
All animations require a NetCDF of whatever variable (DOXG, salinity, NO3, etc.) and statistic (min, max, mean, etc.) is being represented. These are created using process_netcdf.sh. These NetCDF files only have to be created once, but I include this as a first step for all movies listed below so that the instructions are "stand alone." If this step is done for one movie, it does not need to be repeated for another. process_netcdf.sh saves the desired concentration information to NetCDF files stored in `/mmfs1/gscratch/ssmc/USRS/PSI/Rachael/projects/KingCounty/data/{case}/{param}/{SCENARIO_NAME}/{LOC}`
where:
- `{case}` is SOG_NB, whidbey, or main,
- `{param}` is the SSM output parameter name (e.g. DOXG),
- `{SCENARIO_NAME}` is the run-tag used on HYAK (e.g. `3m`), and
- `{LOC}` is either wc (water column), bottom, or surface.
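As a concrete illustration of how those pieces compose into a path (the specific values are only examples):

```python
from pathlib import Path

# Example only: compose the processed-output directory from its parts.
base = Path("/mmfs1/gscratch/ssmc/USRS/PSI/Rachael/projects/KingCounty/data")
case, param, scenario_name, loc = "whidbey", "DOXG", "3m", "wc"

nc_dir = base / case / param / scenario_name / loc
print(nc_dir)  # .../data/whidbey/DOXG/3m/wc
```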
- process_netcdf.sh: Creates a NetCDF of the model output that only includes information for the desired variable (e.g. `DOXG`). A file is automatically created for all 3D values and can be created for surface-only and bottom-only values using the `surface_flag` and/or `bottom_flag` when calling process_netcdf.py. Use process_netcdf.sh as a shell script to call `process_netcdf.py` with the desired setup. The way I have the file structure set up, the files are saved to, e.g. `/mmfs1/gscratch/ssmc/USRS/PSI/Rachael/projects/KingCounty/data/{case}/{param}/{SCENARIO_NAME}/{LOC}`, where `{case}` is SOG_NB, whidbey or main, `{param}` is the SSM output parameter name (e.g. DOXG), `{SCENARIO_NAME}` is the run-tag used on HYAK (e.g. `3m`), and `{LOC}` is either wc (water column), bottom, or surface.
- plot_conc_graphics_for_movies.sh: Uses output from process_netcdf.sh to create daily plots of concentration maps for either a `FullDomain` or `Regional` view. The `Regional` view uses region names in the shapefile and only plots the concentrations for the unmasked nodes within the given region (a rough sketch of this regional masking appears after this list). The `FullDomain` graphics show values for masked and unmasked nodes. This script requires patience (~15 minutes). Daily graphics are saved to, e.g.:
`/mmfs1/gscratch/ssmc/USRS/PSI/Rachael/projects/KingCounty/data/whidbey/DOXG/concentration/movies/FullDomain/wc/3b/whidbey_3b_DOXG_min_conc_wc_1.png`
- create_conc_movies.sh: Uses a FFMPEG apptainer to compile individual graphic outputs from `plot_conc_graphics_for_movies.sh` into a .mp4 movie. This script goes quickly. There is an option for `FullDomain` or `Region`. The script will not overwrite existing movies.
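As promised above, here is a rough sketch of the regional masking idea used for the daily concentration maps. Every name below (file names, the `DOXG` variable, the `region` column, the region label) is an assumption for illustration, not the script's actual variables.

```python
import xarray as xr
import geopandas as gpd
import matplotlib.pyplot as plt

ds = xr.open_dataset("daily_min_DOXG_wc.nc")  # hypothetical processed file
nodes = gpd.read_file("ssm_nodes.shp")        # hypothetical node shapefile

# One DO value per node for a single day; assumes the NetCDF node order
# matches the shapefile row order.
do_values = ds["DOXG"].isel(time=0).values

# Regional view: plot only the nodes labeled with the region of interest.
in_region = (nodes["region"] == "Whidbey Basin").values
fig, ax = plt.subplots()
sc = ax.scatter(
    nodes.geometry.x[in_region], nodes.geometry.y[in_region],
    c=do_values[in_region], s=2, cmap="cividis",
)
fig.colorbar(sc, label="daily min DO (mg/L)")
fig.savefig("whidbey_3b_DOXG_min_conc_wc_1.png")
```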
The code for non-compliance uses a threshold value that can be passed in. The default value for the scenario − reference difference is -0.25 mg/L, which is equivalent to the Department of Ecology (DOE) rounding method based on a -0.2 mg/L threshold. See pp. 49 and 50 of Appendix F of the Optimization Report Appendix for more details.
Salish Sea Model nitrogen inputs are in units of concentration but some of our runs required altering loading. In these runs, I needed to scale concentrations appropriately in order to accurately change the loading. These graphics reflect my internal QAQC to ensure that I scaled the nitrogen levels correctly and as requested.
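The underlying relationship is loading = concentration × flow, so with flow held fixed, a target change in loading is achieved by scaling the input concentration by the same factor. A minimal sketch with placeholder numbers:

```python
# Loading (mass/time) = concentration (mass/volume) * flow (volume/time).
# Numbers below are placeholders for illustration only.
flow = 2.0        # m^3/s, example discharge
conc_2014 = 3.5   # mg/L nitrogen concentration in the 2014 input

loading_2014 = conc_2014 * flow   # g/s, since 1 mg/L = 1 g/m^3

target_factor = 0.5               # e.g. a scenario that halves the loading
conc_scenario = conc_2014 * target_factor
loading_scenario = conc_scenario * flow

# With flow unchanged, scaling concentration scales loading identically.
assert abs(loading_scenario - target_factor * loading_2014) < 1e-12
```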
- Validating the nutrient input loadings for Main region: validate_SSM_input_loading_main.ipynb
- Validating the nutrient input loadings for Whidbey region: validate_SSM_input_loading.ipynb
- Histograms of DO difference between 2014 and scenario runs: QAQC_DeltaDO_DeltaNO3_MainRegion.ipynb.
- Comparing normalized nitrogen loading to normalized noncompliance. Other models that I've worked with have crashed when solutions become infinite, but that's not the case with this version of ICM. The way that I learned this was by noticing outliers in plots that compare normalized noncompliance to normalized nitrogen. See cases `Wtp1` (typo for `Mtp1`) and `Mtp2` in the figure below.
The following files are not public and require access permission through the Puget Sound Institute.
- Municipal model runs and scripting task list.xlsx (Internal PSI document)
- Whidbey configuration file
- Whidbey_Figures&Tables.xlsx (Internal PSI document)
- SOG_NB_Figures&Tables.xlsx (Internal PSI document)
- Main_Figures&Tables (Internal PSI document)