Emissions Report Minimum INformation (ERMIN) Standards

knights-lab/ermin-standards

Background

This repository contains template files describing the recommended minimum information (metadata fields) that should accompany any emissions report, regardless of sector or observation method. The templates also list suggested optional metadata fields that apply only to certain sectors or observation types, and they specify the data format for each metadata field.

The repository also includes software for validation and correction of data tables.

Templates

The primary XLSX-formatted specification can be found here.

Installation

  • Download this repo (zipped version or, better, clone with git).
  • Permanently add the full path to its top-level directory to the system environment variable PYTHONPATH. After reopening your terminal, you should be able to run import ermin from a Python interpreter anywhere on your system.
  • Optionally, permanently add the full path to the bin directory to the system environment variable PATH. This lets you run ERMIN executables from anywhere on your system (e.g. just run validate_ermin_table.py). Otherwise, run with python /path/to/repo/bin/validate_ermin_table.py.
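
If changing PYTHONPATH permanently is not convenient, a single script can extend sys.path itself before importing the package. The following is a minimal sketch of that alternative, not part of ERMIN: the helper name add_repo_to_path and the clone location in the comments are hypothetical.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir):
    """Prepend repo_dir to sys.path so that `import ermin` can find the package.

    Hypothetical helper for illustration; it is not provided by ERMIN.
    """
    repo_path = str(Path(repo_dir).expanduser().resolve())
    # Avoid adding the same directory twice on repeated calls.
    if repo_path not in sys.path:
        sys.path.insert(0, repo_path)
    return repo_path


# Assumed clone location; replace with the real top-level directory of the repo.
# add_repo_to_path("~/src/ermin-standards")
# import ermin
```

This only affects the running interpreter, so the PYTHONPATH approach above remains the better choice for everyday use.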

Requirements:

  • Python version >= 3.6
  • validators package
  • pandas package
  • pytest package for running unit tests
  • Mac or Linux (may work on Windows, but not tested).

Usage

To use as a module:

Validate a Pandas dataframe:

import ermin.validation as ev
import pandas as pd

# load CSV into a Pandas DataFrame
# Notes: use comment='#' to skip comment lines
#        use keep_default_na=False to ensure that empty fields remain strings
#        optionally, use dtype = str to keep all fields as strings
df = pd.read_csv('test/testdata/testinput3.csv', comment='#', keep_default_na=False)

# Example 1: Validate DataFrame using provided spec file, returning a new repaired DataFrame
warnings, errors, newdf = ev.check_input_dataframe(df, spec_file='templates/ermin-specification.csv')

# Example 2: Validate DataFrame using provided spec file, saving repaired table to file
warnings, errors, newdf = ev.check_input_dataframe(df, spec_file='templates/ermin-specification.csv', output_file='t1-fix.csv')
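
The read_csv options noted in the comments above change how empty fields are loaded. The sketch below illustrates the difference on a small in-memory table; the column names and values are made up for the example and are not taken from the ERMIN specification.

```python
import io

import pandas as pd

# Made-up table: one comment line, a header, and one row with an empty field.
csv_text = "# exported by an example tool\nsite,value\nsite_a,\nsite_b,3\n"

# Default parsing: the empty field is read as a missing value (NaN).
default_df = pd.read_csv(io.StringIO(csv_text), comment="#")

# Options suggested above: the empty field stays an empty string,
# and dtype=str keeps every field as a string.
string_df = pd.read_csv(
    io.StringIO(csv_text),
    comment="#",
    keep_default_na=False,
    dtype=str,
)

print(default_df)
print(string_df)
```

In both cases comment='#' drops the leading comment line, so the header row is still read correctly.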

Validate a CSV file and save repaired table to a new file:

import ermin.validation as ev

# Validate file and write a repaired version to new file
warnings, errors = ev.check_input_file('test/testdata/testinput1.csv', 'templates/ermin-specification.csv', output_file='t1-fix.csv')

To use from the command line:

Print instructions with:

python validate_ermin_table.py -h

Check a test file, replace all missing values with NULL (run from the top-level directory of the repo):

python bin/validate_ermin_table.py -s templates/ermin-specification.csv -i test/testdata/testinput1.csv -o t1-fix.csv -v -a

Check two test files, replace all missing values with NULL:

python bin/validate_ermin_table.py -s templates/ermin-specification.csv -i test/testdata/testinput1.csv -o t1-fix.csv -v -a
python bin/validate_ermin_table.py -s templates/ermin-specification.csv -i test/testdata/testinput2.csv -o t2-fix.csv -v -a
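
When more than a couple of files need checking, the same command can be driven from a short Python script. This is a sketch under stated assumptions: it is meant to be run from the top-level directory of the repo, it uses only the flags shown in the commands above, and the helper names build_command and validate_all are hypothetical. The source does not document the script's exit codes, so they are returned as-is without interpretation.

```python
import subprocess
import sys


def build_command(input_csv, output_csv,
                  spec="templates/ermin-specification.csv",
                  script="bin/validate_ermin_table.py"):
    """Return the argument list for one validator run, mirroring the commands above."""
    return [sys.executable, script,
            "-s", spec, "-i", input_csv, "-o", output_csv, "-v", "-a"]


def validate_all(pairs):
    """Run the validator once per (input, output) pair and collect the exit codes."""
    return [subprocess.run(build_command(inp, out)).returncode for inp, out in pairs]


# Example call, using the two test files from the commands above:
# validate_all([
#     ("test/testdata/testinput1.csv", "t1-fix.csv"),
#     ("test/testdata/testinput2.csv", "t2-fix.csv"),
# ])
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.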

Testing

Run unit tests from within the test folder. Requires pytest package.

pytest
