-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: nodata handling #162 #163
Conversation
- adding new types - `MaybeAutoNodata` -- `None|int|float|str|"auto"` - `Nodata` is now `None|float|int` - `MaybeNodata` is now `None|float|int|str - "auto" replaces what used to be `None` - `None` now means "no nodata value" - `resolve_nodata()` is used to handle nodata options consistently across the library - default nodata for float is `nan` - fixes in reproject/overview generation for float data. It is assumed to have `nan` values, and GDAL needs `nodata=nan` to handle it correctly.
58b828c
to
fc2ea18
Compare
Thanks for the swift action on this Kirill - I will review tomorrow. |
- bump versions for netlify - supply token to codecov
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #163 +/- ##
===========================================
+ Coverage 95.26% 95.48% +0.21%
===========================================
Files 31 31
Lines 5323 5489 +166
===========================================
+ Hits 5071 5241 +170
+ Misses 252 248 -4 ☔ View full report in Codecov by Sentry. |
483fd09
to
b9aa246
Compare
Thanks heaps for this @Kirill888 - the test cases appear to match the functionality I was hoping for nicely, but I'll do a "user" test today and verify that TIFFs exported using the updated code work as intended (e.g. in ESRI etc). |
@robbibt thanks, BTW pleas use |
I think this is working perfectly. For example, we have a dataset with an Xarray
This gets written out with a GeoTIFF nodata flag
However, if we write out data without a Xarray
I can though return a true missing nodata value like this:
|
FillValue = Union[float, int] | ||
Nodata = Union[float, int, None] | ||
MaybeNodata = Union[float, int, str, None] | ||
MaybeAutoNodata = Union[float, int, str, None, Literal["auto"]] | ||
T = TypeVar("T") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MaybeNodata = Nodata | str
seems unexpected as Maybe
usually means or None
. I understand that str might be necessary for cases where nodata in ('nan', 'NAN', 'NaN')
but I'm not sure why this gets a "Maybe" label?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, there's a conversion function in math.py
I still think the naming here is confusing but I don't have a good suggestion for a replacement. Maybe ApiNodata
? As in the nodata type the user sees in the API (as opposed the internal nodata value passed to GDAL)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's more of a compatibility issue, before this Nodata
could not be None
, but it could be a str
, so there was a MaybeNodata
also. I guess we could delete it assuming nobody was using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some of the naming is confusing, but the approach is sound.
FillValue = Union[float, int] | ||
Nodata = Union[float, int, None] | ||
MaybeNodata = Union[float, int, str, None] | ||
MaybeAutoNodata = Union[float, int, str, None, Literal["auto"]] | ||
T = TypeVar("T") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, there's a conversion function in math.py
I still think the naming here is confusing but I don't have a good suggestion for a replacement. Maybe ApiNodata
? As in the nodata type the user sees in the API (as opposed the internal nodata value passed to GDAL)?
I do want to do a quick test of the COG overviews stuff, so will approve this as soon as I've finished that. |
I also find "MaybeAutoNodata" a bit confusing - took me a while to work out what it meant. |
What about |
Yes |
now that `Nodata` is allowed to be `None`, 'MaybeNodata' is now confusing, removing it. Let's assume that it never was a part of the `odc.geo` API.
Updated,
|
MaybeAutoNodata
SomeNodata
--None|int|float|str|"auto"
Nodata
is nowNone|float|int
MaybeNodata
is nowhave been removedNone|float|int|str
None
None
now means "no nodata value"resolve_nodata()
is used to handle nodata options consistently across the library.odc.nodata
is used to extractnodata
from xarraysnan
values, and GDAL needsnodata=nan
to handle it correctly.nodata types explained
MaybeAutoNodata
SomeNodata
should be used in APIs that allow configuring nodata and nodata overridesNodata
is a type suitable to pass to GDAL{src,dst)_nodata
it can beNone
which means expect no missing values in the data.FillValue
is a type suitable fornp.full
and is resolved fromNodata
+dtype
NaN
for not-configured floating point images0
for not-configured integer imagesnodata
if set (NaN
is fine here too)