Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optionally clip data in crude_stretch #90

Open
gerritholl opened this issue Oct 11, 2021 · 5 comments
Open

optionally clip data in crude_stretch #90

gerritholl opened this issue Oct 11, 2021 · 5 comments

Comments

@gerritholl
Copy link
Contributor

Feature Request

Is your feature request related to a problem? Please describe.

Users generating NinJoTIFF files with Satpy who wish to use the pyninjotiff clipping functionality currently have to define the boundaries of their data values in two places:

  • In their enhancements file (enhancements/generic.yaml) in the units to which the data are calibrated; for example, for solar infrared channels, this is in brightness temperature in units of Kelvin, such as:
      method: !!python/name:satpy.enhancements.stretch
      kwargs: {stretch: crude, min_stretch: 313.5, max_stretch: 186}
  • As keyword arguments to the NinJoTIFF writer, with keywords ch_min_measurement_unit and ch_max_measurement_unit, now in units such as interpreted by NinJo, which may be in °C. From a trollflow2.yaml example:
  physic_unit: C
  ch_min_measurement_unit: 40
  ch_max_measurement_unit: -87.5

This happens because stretch (implemented in trollimage.xrimage.XRImage.crude_stretch) stretches the data between the two values such that 0 maps to the indicated minimum and 1 maps to the indicated maximum; and then pyninjotiff clips the data (in pyninjotiff.ninjotiff._finalize) and rescales it making place for a fill value.

It is the responsibility of the user to make sure those two match, or values as interpreted by the client software (in this case NinJo) will be incorrect.

This is problematic, because:

  • the user needs to configure the limits twice, possibly in different units, which is redundant work and can be error-prone;
  • the ninjotiff writer is altering the data in a way that should not be the responsibility of the writer.

There is ongoing work to replace the ninjotiff writer by a ninjogeotiff writer (see pytroll/satpy#1839). This new writer provides an opportunity for a cleanup. As part of that cleanup, the two problems with the current approach can be resolved. This feature request is triggered by a need fo rthe ninjogeotiff writer.

Describe the solution you'd like

I would like that trollimage.xrimage.XRImage.crude_stretch optionally clips the data to the indicated range. This can be implemented by an additional boolean keyword argument clip, which would default to False (the current behaviour). This would allow the user to configure their limits only in one place and the satpy writer to focus on actual writing.

Describe any changes to existing user workflow

The status quo would remain the default, so this change should be fully backwards compatible.

Additional context

Perhaps the satpy writer could still take care of the clipping, but look up the clip limits in the enhancement history, which may contain the parameters to crude stretch. This feels wrong.

Perhaps the stretch-that-can-clip would be a different enhancement than the current crude-stretch, rather than the current enhancement with a new keyword argument. I think a keyword argument fits better.

Perhaps clipping should be a separate step from enhancing, and there should be two enhancements applied if users want clipping. I don't know how to apply two enhancements.

Perhaps there are other approaches that I'm not thinking of.

@djhoese
Copy link
Member

djhoese commented Oct 11, 2021

So on one hand I like this. I know I've done things like this Polar2Grid in the past and even gone as far as making it possible to only clip the minimum and not the maximum or vice versa. If this was implemented it might be nice to allow for that option.

That said, I don't think this clipping is actually necessary. The crude stretch is scaling the data to a 0 to 1 range. I believe the save methods then clip the data to the data range supported by the output format (ex uint8 -> 0 to 255).

def _scale_to_dtype(self, data, dtype, fill_value=None):
"""Scale provided data to dtype range assuming a 0-1 range.
Float input data is assumed to be normalized to a 0 to 1 range.
Integer input data is not scaled, only clipped. A float output
type is not scaled since both outputs and inputs are assumed to
be in the 0-1 range already.
"""
attrs = data.attrs.copy()
if np.issubdtype(dtype, np.integer):
if np.issubdtype(data, np.integer):
# preserve integer data type
data = data.clip(np.iinfo(dtype).min, np.iinfo(dtype).max)
else:
# scale float data (assumed to be 0 to 1) to full integer space
# leave room for fill value if needed
scale, offset = self._get_dtype_scale_offset(dtype, fill_value)
data = data.clip(0, 1) * scale + offset
attrs.setdefault('enhancement_history', list()).append({'scale': scale, 'offset': offset})
data = data.round()
data.attrs = attrs
return data

Based on this code this is true for anything doing a stretch (which would mean floating point numbers between 0 and 1), but would not effect integers which may or may not be desired.

Another consideration if I'm wrong: A separate clip method? Still duplicating limits in two enhancement operations but at least they are right next to each other.

@gerritholl
Copy link
Contributor Author

gerritholl commented Oct 11, 2021

There is one edge case that may mean you're wrong, or that at least would still need special treatment: the fill value. For example: if BTs range from 180 to 320, then crude_stretch(200, 300) will map this to range from -0.2 to 1.2. Without a fill value, finalize will map this to 0-255, clip small BTs to 0 and large BTs to 1 and all is fine. With a fill value of 0, finalize will map this from 1-255... and BTs in the range 180-200 may end up clipped to 0 (fill value) rather than 1 (lowest physical value).

@djhoese
Copy link
Member

djhoese commented Oct 11, 2021

I think the else: block of the code I linked to above handles that. It first clips data.clip(0, 1) then scales the data to the fill value range (in this case 1 to 255). So in your example the 180 to 200 data should have already been clipped before "space" was made for the fill value at 0. ...right?

@gerritholl
Copy link
Contributor Author

Ah yes, it would seem you are right. _scale_to_dtype does a bit more than I thought. Maybe that means the feature request I'm describing here is entirely redundant, as is the work currently done by pyninjotiff (which makes me wonder about the history here, but that's another question). I'll do some tests with the satpy ninjogeotiff writer I'm developing and will close this issue if there is indeed nothing that I need to change in trollimage.

@djhoese
Copy link
Member

djhoese commented Oct 11, 2021

Sounds good. The fill value handling in trollimage is relatively new (in the last year maybe).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants