Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add reader for GERB high-resolution HDF5 files #2572

Merged
merged 22 commits into from
Oct 6, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
45 changes: 45 additions & 0 deletions satpy/etc/readers/gerb_l2_hr_h5.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
reader:
name: gerb_l2_hr_h5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a not "HR" version of these files? Are these "official" products? Are there other L2 products that aren't HDF5? I'm wondering if we could just call this reader gerb_l2 or gerb_l2_h5 or something shorter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, the L2 HR (9km) product is the one we recommend to users. We have a so-called ARG L1.5 (50 km) product and a BARG L2 (50km). Those are the official products of the GERB instrument (RAL Space, Imperial College and the Royal Meteorological Institute of Belgium produce the data on behalf of EUMETSAT).

There might be an issue if/when we produce "edition 2" data in the future that might be in NetCDF.

I think that gerb_l2_h5 should be fine though :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense. If there is a chance of NetCDF versions in the future then maybe the h5 is the best for now. We could always try to throw support for all the files in a generic gerb_l2 reader, but maybe that's a little too optimistic about the similarities of the files (especially when some of them don't exist yet).

short_name: GERB HR
long_name: Meteosat Second Generation Geostationary Earth Radiation Budget L2 High-Resolution
description: Reader for the HR product of the Geostationary Earth Radiation Budget instrument
status: Beta
supports_fsspec: false
reader: !!python/name:satpy.readers.yaml_reader.FileYAMLReader
sensors: [gerb]

file_types:
gerb_l2_hr_h5:
file_reader: !!python/name:satpy.readers.gerb_l2_hr_h5.GERB_HR_FileHandler
file_patterns: ['{sensor_name}_{seviri_name}_L20_HR_SOL_TH_{sensing_time:%Y%m%d_%H%M%S}_{gerb_version}.hdf']

datasets:
Solar_Flux:
name: Solar Flux
sensor: gerb
units: W m-2
fill_value: -32767
standard_name: toa_outgoing_shortwave_flux
file_type: gerb_l2_hr_h5

Thermal_Flux:
name: Thermal Flux
sensor: gerb
units: W m-2
fill_value: -32767
standard_name: toa_outgoing_longwave_flux
file_type: gerb_l2_hr_h5

Solar_Radiance:
name: Solar Radiance
sensor: gerb
units: W m-2 sr-1
fill_value: -32767
file_type: gerb_l2_hr_h5

Thermal_Radiance:
name: Thermal Radiance
sensor: gerb
units: W m-2 sr-1
fill_value: -32767
file_type: gerb_l2_hr_h5
89 changes: 89 additions & 0 deletions satpy/readers/gerb_l2_hr_h5.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (c) 2023
#
# This file is part of satpy.
#
# satpy is free software: you can redistribute it and/or modify it under the
# terms of the GNU General Public License as published by the Free Software
# Foundation, either version 3 of the License, or (at your option) any later
# version.
#
# satpy is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# satpy. If not, see <http://www.gnu.org/licenses/>.


"""GERB L2 HR HDF5 reader.

A reader for the Top of Atmosphere outgoing fluxes from the Geostationary Earth Radiation
Budget instrument aboard the Meteosat Second Generation satellites.
"""


import logging
from datetime import timedelta

from satpy.readers.hdf5_utils import HDF5FileHandler
from satpy.resample import get_area_def

LOG = logging.getLogger(__name__)


def gerb_get_dataset(ds, ds_info):
"""
Load a GERB dataset in memory from a HDF5 file or HDF5FileHandler.

The routine takes into account the quantisation factor and fill values.
"""
ds_attrs = ds.attrs
ds_fill = ds_info['fill_value']
fill_mask = ds != ds_fill
if 'Quantisation Factor' in ds_attrs and 'Unit' in ds_attrs:
ds = ds*ds_attrs['Quantisation Factor']
else:
ds = ds*1.

Check warning on line 48 in satpy/readers/gerb_l2_hr_h5.py

View check run for this annotation

Codecov / codecov/patch

satpy/readers/gerb_l2_hr_h5.py#L48

Added line #L48 was not covered by tests
ds = ds.where(fill_mask)
return ds


class GERB_HR_FileHandler(HDF5FileHandler):
"""File handler for GERB L2 High Resolution H5 files."""

@property
def end_time(self):
"""Get end time."""
return self.start_time + timedelta(minutes=15)

@property
def start_time(self):
"""Get start time."""
return self.filename_info['sensing_time']

def get_dataset(self, ds_id, ds_info):
"""Read a HDF5 file into an xarray DataArray."""
ds_name = ds_id['name']
if ds_name not in ['Solar Flux', 'Thermal Flux', 'Solar Radiance', 'Thermal Radiance']:
raise KeyError(f"{ds_name} is an unknown dataset for this reader.")

Check warning on line 70 in satpy/readers/gerb_l2_hr_h5.py

View check run for this annotation

Codecov / codecov/patch

satpy/readers/gerb_l2_hr_h5.py#L70

Added line #L70 was not covered by tests

ds = gerb_get_dataset(self[f'Radiometry/{ds_name}'], ds_info)

ds.attrs.update({'start_time': self.start_time, 'data_time': self.start_time, 'end_time': self.end_time})

return ds

def get_area_def(self, dsid):
"""Area definition for the GERB product."""
ssp_lon = self.file_content["Geolocation/attr/Nominal Satellite Longitude (degrees)"]

if abs(ssp_lon) < 1e-6:
return get_area_def("msg_seviri_fes_9km")
elif abs(ssp_lon - 9.5) < 1e-6:
return get_area_def("msg_seviri_fes_9km")
elif abs(ssp_lon - 45.5) < 1e-6:
return get_area_def("msg_seviri_iodc_9km")

Check warning on line 87 in satpy/readers/gerb_l2_hr_h5.py

View check run for this annotation

Codecov / codecov/patch

satpy/readers/gerb_l2_hr_h5.py#L84-L87

Added lines #L84 - L87 were not covered by tests
else:
raise ValueError(f"There is no matching grid for SSP longitude {self.ssp_lon}")

Check warning on line 89 in satpy/readers/gerb_l2_hr_h5.py

View check run for this annotation

Codecov / codecov/patch

satpy/readers/gerb_l2_hr_h5.py#L89

Added line #L89 was not covered by tests
129 changes: 129 additions & 0 deletions satpy/tests/reader_tests/test_gerb_l2_hr_h5.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,129 @@
#!/usr/bin/env python

Check warning on line 1 in satpy/tests/reader_tests/test_gerb_l2_hr_h5.py

View check run for this annotation

CodeScene Delta Analysis / CodeScene Cloud Delta Analysis (main)

❌ New issue: Large Method

gerb_l2_hr_h5_dummy_file has 72 lines, threshold = 70. Large functions with many lines of code are generally harder to understand and lower the code health. Avoid adding more lines to this function.
# -*- coding: utf-8 -*-
# Copyright (c) 2018 Satpy developers
#
# This file is part of satpy.
#
# satpy is free software: you can redistribute it and/or modify it under the
# terms of the GNU General Public License as published by the Free Software
# Foundation, either version 3 of the License, or (at your option) any later
# version.
#
# satpy is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# satpy. If not, see <http://www.gnu.org/licenses/>.
"""Unit tests for GERB L2 HR HDF5 reader."""

import h5py
import numpy as np
import pytest

from satpy import Scene

FNAME = "G4_SEV4_L20_HR_SOL_TH_20190606_130000_V000.hdf"


def make_h5_null_string(length):
"""Make a HDF5 type for a NULL terminated string of fixed length."""
dt = h5py.h5t.TypeID.copy(h5py.h5t.C_S1)
dt.set_size(7)
dt.set_strpad(h5py.h5t.STR_NULLTERM)
return dt


def write_h5_null_string_att(loc_id, name, s):
"""Write a NULL terminated string attribute at loc_id."""
dt = make_h5_null_string(length=7)
name = bytes(name.encode('ascii'))
s = bytes(s.encode('ascii'))
at = h5py.h5a.create(loc_id, name, dt, h5py.h5s.create(h5py.h5s.SCALAR))
at.write(np.array(s, dtype=f'|S{len(s)+1}'))


@pytest.fixture(scope="session")
def gerb_l2_hr_h5_dummy_file(tmp_path_factory):
"""Create a dummy HDF5 file for the GERB L2 HR product."""
filename = tmp_path_factory.mktemp("data") / FNAME
Comment on lines +46 to +49
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code quality checkers are a little upset with how big this function is. How much of this is used in the actual reading code? Is this a complete copy of the files structure? If it isn't all used you could remove it from the test. A lot of it also looks like it could go into a function (copy C_S1, set size, create variable, write, etc). Maybe even a for loop. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this is code generated by a routine that will duplicate some instructions. A lot actually. And very little of the file content is currently used in the code. Basically, the reader loads the most used data and makes sure that the geolocation is correct.


with h5py.File(filename, 'w') as fid:
fid.create_group('/Angles')
fid['/Angles/Relative Azimuth'] = np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Angles/Relative Azimuth'].attrs['Quantisation Factor'] = np.array(0.1, dtype='float64')
fid['/Angles/Solar Zenith'] = np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Angles/Solar Zenith'].attrs['Quantisation Factor'] = np.array(0.1, dtype='float64')
write_h5_null_string_att(fid['/Angles/Relative Azimuth'].id, 'Unit', 'Degree')
fid['/Angles/Viewing Azimuth'] = np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Angles/Viewing Azimuth'].attrs['Quantisation Factor'] = np.array(0.1, dtype='float64')
write_h5_null_string_att(fid['/Angles/Viewing Azimuth'].id, 'Unit', 'Degree')
fid['/Angles/Viewing Zenith'] = np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Angles/Viewing Zenith'].attrs['Quantisation Factor'] = np.array(0.1, dtype='float64')
write_h5_null_string_att(fid['/Angles/Viewing Zenith'].id, 'Unit', 'Degree')
fid.create_group('/GERB')
dt = h5py.h5t.TypeID.copy(h5py.h5t.C_S1)
dt.set_size(3)
dt.set_strpad(h5py.h5t.STR_NULLTERM)
write_h5_null_string_att(fid['/GERB'].id, 'Instrument Identifier', 'G4')
fid.create_group('/GGSPS')
fid['/GGSPS'].attrs['L1.5 NANRG Product Version'] = np.array(-1, dtype='int32')
fid.create_group('/Geolocation')
write_h5_null_string_att(fid['/Geolocation'].id, 'Geolocation File Name',
'G4_SEV4_L20_HR_GEO_20180111_181500_V010.hdf')
fid['/Geolocation'].attrs['Nominal Satellite Longitude (degrees)'] = np.array(0.0, dtype='float64')
fid.create_group('/Imager')
fid['/Imager'].attrs['Instrument Identifier'] = np.array(4, dtype='int32')
write_h5_null_string_att(fid['/Imager'].id, 'Type', 'SEVIRI')
fid.create_group('/RMIB')
fid.create_group('/Radiometry')
fid['/Radiometry'].attrs['SEVIRI Radiance Definition Flag'] = np.array(2, dtype='int32')
fid['/Radiometry/A Values (per GERB detector cell)'] = np.ones(shape=(256,), dtype=np.dtype('>f8'))
fid['/Radiometry/C Values (per GERB detector cell)'] = np.ones(shape=(256,), dtype=np.dtype('>f8'))
fid['/Radiometry/Longwave Correction'] = np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Radiometry/Longwave Correction'].attrs['Offset'] = np.array(1.0, dtype='float64')
fid['/Radiometry/Longwave Correction'].attrs['Quantisation Factor'] = np.array(0.005, dtype='float64')
fid['/Radiometry/Shortwave Correction'] = np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Radiometry/Shortwave Correction'].attrs['Offset'] = np.array(1.0, dtype='float64')
fid['/Radiometry/Shortwave Correction'].attrs['Quantisation Factor'] = np.array(0.005, dtype='float64')
fid['/Radiometry/Solar Flux'] = np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Radiometry/Solar Flux'].attrs['Quantisation Factor'] = np.array(0.25, dtype='float64')
write_h5_null_string_att(fid['/Radiometry/Solar Flux'].id, 'Unit', 'Watt per square meter')
fid['/Radiometry/Solar Radiance'] = np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Radiometry/Solar Radiance'].attrs['Quantisation Factor'] = np.array(0.05, dtype='float64')
write_h5_null_string_att(fid['/Radiometry/Solar Radiance'].id, 'Unit', 'Watt per square meter per steradian')
fid['/Radiometry/Thermal Flux'] = np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Radiometry/Thermal Flux'].attrs['Quantisation Factor'] = np.array(0.25, dtype='float64')
write_h5_null_string_att(fid['/Radiometry/Thermal Flux'].id, 'Unit', 'Watt per square meter')
fid['/Radiometry/Thermal Radiance'] = np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Radiometry/Thermal Radiance'].attrs['Quantisation Factor'] = np.array(0.05, dtype='float64')
write_h5_null_string_att(fid['/Radiometry/Thermal Radiance'].id, 'Unit', 'Watt per square meter per steradian')
fid.create_group('/Scene Identification')
write_h5_null_string_att(fid['/Scene Identification'].id,
'Solar Angular Dependency Models Set Version', 'CERES_TRMM.1')
write_h5_null_string_att(fid['/Scene Identification'].id,
'Thermal Angular Dependency Models Set Version', 'RMIB.3')
fid['/Scene Identification/Cloud Cover'] = np.ones(shape=(1237, 1237), dtype=np.dtype('uint8'))
fid['/Scene Identification/Cloud Cover'].attrs['Quantisation Factor'] = np.array(0.01, dtype='float64')
write_h5_null_string_att(fid['/Scene Identification/Cloud Cover'].id, 'Unit', 'Percent')
fid['/Scene Identification/Cloud Optical Depth (logarithm)'] = \
np.ones(shape=(1237, 1237), dtype=np.dtype('>i2'))
fid['/Scene Identification/Cloud Optical Depth (logarithm)'].attrs['Quantisation Factor'] = \
np.array(0.00025, dtype='float64')
fid['/Scene Identification/Cloud Phase'] = np.ones(shape=(1237, 1237), dtype=np.dtype('uint8'))
fid['/Scene Identification/Cloud Phase'].attrs['Quantisation Factor'] = np.array(0.01, dtype='float64')
write_h5_null_string_att(fid['/Scene Identification/Cloud Phase'].id, 'Unit',
'Percent (Water=0%,Mixed,Ice=100%)')
fid.create_group('/Times')
fid['/Times/Time (per row)'] = np.ones(shape=(1237,), dtype=np.dtype('|S22'))

return filename


@pytest.mark.parametrize("name", ["Solar Flux", "Thermal Flux", "Solar Radiance", "Thermal Radiance"])
def test_dataset_load(gerb_l2_hr_h5_dummy_file, name):
"""Test loading the solar flux component."""
scene = Scene(reader='gerb_l2_hr_h5', filenames=[gerb_l2_hr_h5_dummy_file])
scene.load([name])
assert scene[name].shape == (1237, 1237)
assert np.nanmax((scene[name].to_numpy().flatten() - 0.25)) < 1e-6
Loading