Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue #678 support shapely in load collection spatial extent #682

Closed
Show file tree
Hide file tree
Changes from 4 commits
Commits
Show all changes
18 commits
Select commit Hold shift + click to select a range
4db1718
issue #678 support shapely in load collection spatial extent
ElienVandermaesenVITO Dec 10, 2024
146f35e
issue #678 support shapely in load collection spatial extent
ElienVandermaesenVITO Dec 10, 2024
ef50a25
issue #678 support shapely in load collection spatial extent
ElienVandermaesenVITO Dec 10, 2024
8cadb84
issue #678 support shapely and local path in load collection spatial …
ElienVandermaesenVITO Dec 13, 2024
d277011
issue #678 support shapely and local path in load collection spatial …
ElienVandermaesenVITO Dec 16, 2024
81431c2
issue #678 support shapely and local path in load collection, make ch…
ElienVandermaesenVITO Dec 19, 2024
a64ce16
issue #678 correct valid types of geojson types in filter_spatial
ElienVandermaesenVITO Jan 2, 2025
a6c6533
issue #693 solve merge conflicts
ElienVandermaesenVITO Jan 8, 2025
db6bbb8
issue #693 solve merge conflicts
ElienVandermaesenVITO Jan 8, 2025
e9ec44d
issue #678 improve documentation
ElienVandermaesenVITO Jan 9, 2025
d83b3c8
Merge remote-tracking branch 'origin/master' into issue678-load_colle…
soxofaan Jan 16, 2025
ac11868
Issue #678/#682 further finetuning
soxofaan Jan 16, 2025
adf4f4c
fixup! Issue #678/#682 further finetuning
soxofaan Jan 16, 2025
a6f1cf9
fixup! fixup! Issue #678/#682 further finetuning
soxofaan Jan 16, 2025
45f0215
Issue #678/#682 more test coverage of `spatial_extent` handling
soxofaan Jan 17, 2025
8a3ba03
fixup! Issue #678/#682 more test coverage of `spatial_extent` handling
soxofaan Jan 17, 2025
0b486a4
Issue #678/#682 more tests for load_stac spatial_extent handling
soxofaan Jan 17, 2025
1cd1cd2
Issue #678/#682 finetune TestDataCube tests
soxofaan Jan 17, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added

- Automatically use `load_url` when providing a URL as geometries to `DataCube.aggregate_spatial()`, `DataCube.mask_polygon()`, etc. ([#104](https://github.com/Open-EO/openeo-python-client/issues/104), [#457](https://github.com/Open-EO/openeo-python-client/issues/457))
- Argument `spatial_extent` in `load_collection` supports type `shapely` and loading geometry from a local path.
ElienVandermaesenVITO marked this conversation as resolved.
Show resolved Hide resolved

### Changed

Expand Down
9 changes: 7 additions & 2 deletions openeo/rest/connection.py
Original file line number Diff line number Diff line change
Expand Up @@ -1230,7 +1230,7 @@ def datacube_from_json(self, src: Union[str, Path], parameters: Optional[dict] =
def load_collection(
self,
collection_id: Union[str, Parameter],
spatial_extent: Union[Dict[str, float], Parameter, None] = None,
spatial_extent: Union[Dict[str, float], Parameter, shapely.geometry.base.BaseGeometry, None] = None,
temporal_extent: Union[Sequence[InputDate], Parameter, str, None] = None,
bands: Union[None, List[str], Parameter] = None,
properties: Union[
Expand All @@ -1243,7 +1243,12 @@ def load_collection(
Load a DataCube by collection id.

:param collection_id: image collection identifier
:param spatial_extent: limit data to specified bounding box or polygons
:param spatial_extent: limit data to specified bounding box or polygons. Can be provided in different ways:
- a shapely geometry
ElienVandermaesenVITO marked this conversation as resolved.
Show resolved Hide resolved
ElienVandermaesenVITO marked this conversation as resolved.
Show resolved Hide resolved
- a GeoJSON-style dictionary,
- a path (:py:class:`str` or :py:class:`~pathlib.Path`) to a local, client-side GeoJSON file,
which will be loaded automatically to get the geometries as GeoJSON construct.
- a :py:class:`~openeo.api.process.Parameter` instance.
:param temporal_extent: limit data to specified temporal interval.
Typically, just a two-item list or tuple containing start and end date.
See :ref:`filtering-on-temporal-extent-section` for more details on temporal extent handling and shorthand notation.
Expand Down
166 changes: 87 additions & 79 deletions openeo/rest/datacube.py
Original file line number Diff line number Diff line change
Expand Up @@ -143,7 +143,7 @@ def load_collection(
cls,
collection_id: Union[str, Parameter],
connection: Optional[Connection] = None,
spatial_extent: Union[Dict[str, float], Parameter, None] = None,
spatial_extent: Union[Dict[str, float], Parameter, shapely.geometry.base.BaseGeometry, None] = None,
ElienVandermaesenVITO marked this conversation as resolved.
Show resolved Hide resolved
temporal_extent: Union[Sequence[InputDate], Parameter, str, None] = None,
bands: Union[None, List[str], Parameter] = None,
fetch_metadata: bool = True,
Expand All @@ -158,7 +158,12 @@ def load_collection(
:param collection_id: image collection identifier
:param connection: The backend connection to use.
Can be ``None`` to work without connection and collection metadata.
:param spatial_extent: limit data to specified bounding box or polygons
:param spatial_extent: limit data to specified bounding box or polygons. Can be provided in different ways:
- a shapely geometry
ElienVandermaesenVITO marked this conversation as resolved.
Show resolved Hide resolved
- a GeoJSON-style dictionary,
- a path (:py:class:`str` or :py:class:`~pathlib.Path`) to a local, client-side GeoJSON file,
which will be loaded automatically to get the geometries as GeoJSON construct.
- a :py:class:`~openeo.api.process.Parameter` instance.
:param temporal_extent: limit data to specified temporal interval.
Typically, just a two-item list or tuple containing start and end date.
See :ref:`filtering-on-temporal-extent-section` for more details on temporal extent handling and shorthand notation.
Expand Down Expand Up @@ -187,9 +192,14 @@ def load_collection(
"Unexpected parameterized `spatial_extent` in `load_collection`:"
f" expected schema with type 'object' but got {spatial_extent.schema!r}."
)
valid_geojson_types = [
"Polygon", "MultiPolygon", "Feature", "FeatureCollection"
]
if spatial_extent and not (isinstance(spatial_extent, dict) and spatial_extent.keys() & {"west", "east", "north", "south"}):
spatial_extent = _get_geometry_argument(argument=spatial_extent,valid_geojson_types=valid_geojson_types,connection=connection)
ElienVandermaesenVITO marked this conversation as resolved.
Show resolved Hide resolved

arguments = {
'id': collection_id,
# TODO: spatial_extent could also be a "geojson" subtype object, so we might want to allow (and convert) shapely shapes as well here.
'spatial_extent': spatial_extent,
'temporal_extent': temporal_extent,
}
Expand Down Expand Up @@ -628,10 +638,9 @@ def filter_spatial(
(which will be loaded client-side to get the geometries as GeoJSON construct).
"""
valid_geojson_types = [
"Point", "MultiPoint", "LineString", "MultiLineString",
ElienVandermaesenVITO marked this conversation as resolved.
Show resolved Hide resolved
"Polygon", "MultiPolygon", "GeometryCollection", "FeatureCollection"
]
geometries = self._get_geometry_argument(geometries, valid_geojson_types=valid_geojson_types, crs=None)
geometries = _get_geometry_argument(geometries, valid_geojson_types=valid_geojson_types, connection=self.connection, crs=None)
return self.process(
process_id='filter_spatial',
arguments={
Expand Down Expand Up @@ -1058,75 +1067,6 @@ def _merge_operator_binary_cubes(
}
))

def _get_geometry_argument(
self,
argument: Union[
shapely.geometry.base.BaseGeometry,
dict,
str,
pathlib.Path,
Parameter,
_FromNodeMixin,
],
valid_geojson_types: List[str],
crs: Optional[str] = None,
) -> Union[dict, Parameter, PGNode]:
"""
Convert input to a geometry as "geojson" subtype object or vectorcube.

:param crs: value that encodes a coordinate reference system.
See :py:func:`openeo.util.normalize_crs` for more details about additional normalization that is applied to this argument.
"""
if isinstance(argument, Parameter):
return argument
elif isinstance(argument, _FromNodeMixin):
return argument.from_node()

if isinstance(argument, str) and re.match(r"^https?://", argument, flags=re.I):
# Geometry provided as URL: load with `load_url` (with best-effort format guess)
url = urllib.parse.urlparse(argument)
suffix = pathlib.Path(url.path.lower()).suffix
format = {
".json": "GeoJSON",
".geojson": "GeoJSON",
".pq": "Parquet",
".parquet": "Parquet",
".geoparquet": "Parquet",
}.get(suffix, suffix.split(".")[-1])
return self.connection.load_url(url=argument, format=format)

if (
isinstance(argument, (str, pathlib.Path))
and pathlib.Path(argument).is_file()
and pathlib.Path(argument).suffix.lower() in [".json", ".geojson"]
):
geometry = load_json(argument)
elif isinstance(argument, shapely.geometry.base.BaseGeometry):
geometry = mapping(argument)
elif isinstance(argument, dict):
geometry = argument
else:
raise OpenEoClientException(f"Invalid geometry argument: {argument!r}")

if geometry.get("type") not in valid_geojson_types:
raise OpenEoClientException("Invalid geometry type {t!r}, must be one of {s}".format(
t=geometry.get("type"), s=valid_geojson_types
))
if crs:
# TODO: don't warn when the crs is Lon-Lat like EPSG:4326?
warnings.warn(f"Geometry with non-Lon-Lat CRS {crs!r} is only supported by specific back-ends.")
# TODO #204 alternative for non-standard CRS in GeoJSON object?
epsg_code = normalize_crs(crs)
if epsg_code is not None:
# proj did recognize the CRS
crs_name = f"EPSG:{epsg_code}"
else:
# proj did not recognise this CRS
warnings.warn(f"non-Lon-Lat CRS {crs!r} is not known to the proj library and might not be supported.")
crs_name = crs
geometry["crs"] = {"type": "name", "properties": {"name": crs_name}}
return geometry

@openeo_process
def aggregate_spatial(
self,
Expand Down Expand Up @@ -1198,7 +1138,7 @@ def aggregate_spatial(
"Point", "MultiPoint", "LineString", "MultiLineString",
"Polygon", "MultiPolygon", "GeometryCollection", "Feature", "FeatureCollection"
]
geometries = self._get_geometry_argument(geometries, valid_geojson_types=valid_geojson_types, crs=crs)
geometries = _get_geometry_argument(geometries, valid_geojson_types=valid_geojson_types, connection= self.connection, crs=crs)
reducer = build_child_callback(reducer, parent_parameters=["data"])
return VectorCube(
graph=self._build_pgnode(
Expand Down Expand Up @@ -1478,8 +1418,8 @@ def chunk_polygon(
"Feature",
"FeatureCollection",
]
chunks = self._get_geometry_argument(
chunks, valid_geojson_types=valid_geojson_types
chunks = _get_geometry_argument(
chunks, valid_geojson_types=valid_geojson_types, connection=self.connection
)
mask_value = float(mask_value) if mask_value is not None else None
return self.process(
Expand Down Expand Up @@ -1568,7 +1508,7 @@ def apply_polygon(

process = build_child_callback(process, parent_parameters=["data"], connection=self.connection)
valid_geojson_types = ["Polygon", "MultiPolygon", "Feature", "FeatureCollection"]
geometries = self._get_geometry_argument(geometries, valid_geojson_types=valid_geojson_types)
geometries = _get_geometry_argument(geometries, valid_geojson_types=valid_geojson_types, connection=self.connection)
mask_value = float(mask_value) if mask_value is not None else None
return self.process(
process_id="apply_polygon",
Expand Down Expand Up @@ -2056,7 +1996,7 @@ def mask_polygon(
(which will be loaded client-side to get the geometries as GeoJSON construct).
"""
valid_geojson_types = ["Polygon", "MultiPolygon", "GeometryCollection", "Feature", "FeatureCollection"]
mask = self._get_geometry_argument(mask, valid_geojson_types=valid_geojson_types, crs=srs)
mask = _get_geometry_argument(mask, valid_geojson_types=valid_geojson_types, connection=self.connection, crs=srs)
return self.process(
process_id="mask_polygon",
arguments=dict_no_none(
Expand Down Expand Up @@ -2860,3 +2800,71 @@ def unflatten_dimension(self, dimension: str, target_dimensions: List[str], labe
label_separator=label_separator,
),
)
def _get_geometry_argument(
ElienVandermaesenVITO marked this conversation as resolved.
Show resolved Hide resolved
argument: Union[
shapely.geometry.base.BaseGeometry,
dict,
str,
pathlib.Path,
Parameter,
_FromNodeMixin,
],
valid_geojson_types: List[str],
connection: Connection = None,
crs: Optional[str] = None,
) -> Union[dict, Parameter, PGNode]:
"""
Convert input to a geometry as "geojson" subtype object or vectorcube.

:param crs: value that encodes a coordinate reference system.
See :py:func:`openeo.util.normalize_crs` for more details about additional normalization that is applied to this argument.
"""
if isinstance(argument, Parameter):
return argument
elif isinstance(argument, _FromNodeMixin):
return argument.from_node()

if isinstance(argument, str) and re.match(r"^https?://", argument, flags=re.I):
# Geometry provided as URL: load with `load_url` (with best-effort format guess)
url = urllib.parse.urlparse(argument)
suffix = pathlib.Path(url.path.lower()).suffix
format = {
".json": "GeoJSON",
".geojson": "GeoJSON",
".pq": "Parquet",
".parquet": "Parquet",
".geoparquet": "Parquet",
}.get(suffix, suffix.split(".")[-1])
return connection.load_url(url=argument, format=format)
#
if (
isinstance(argument, (str, pathlib.Path))
and pathlib.Path(argument).is_file()
and pathlib.Path(argument).suffix.lower() in [".json", ".geojson"]
):
geometry = load_json(argument)
elif isinstance(argument, shapely.geometry.base.BaseGeometry):
geometry = mapping(argument)
elif isinstance(argument, dict):
geometry = argument
else:
raise OpenEoClientException(f"Invalid geometry argument: {argument!r}")

if geometry.get("type") not in valid_geojson_types:
raise OpenEoClientException("Invalid geometry type {t!r}, must be one of {s}".format(
t=geometry.get("type"), s=valid_geojson_types
))
if crs:
# TODO: don't warn when the crs is Lon-Lat like EPSG:4326?
warnings.warn(f"Geometry with non-Lon-Lat CRS {crs!r} is only supported by specific back-ends.")
# TODO #204 alternative for non-standard CRS in GeoJSON object?
epsg_code = normalize_crs(crs)
if epsg_code is not None:
# proj did recognize the CRS
crs_name = f"EPSG:{epsg_code}"
else:
# proj did not recognise this CRS
warnings.warn(f"non-Lon-Lat CRS {crs!r} is not known to the proj library and might not be supported.")
crs_name = crs
geometry["crs"] = {"type": "name", "properties": {"name": crs_name}}
return geometry
27 changes: 27 additions & 0 deletions tests/rest/datacube/test_datacube.py
Original file line number Diff line number Diff line change
Expand Up @@ -136,6 +136,33 @@ def test_load_collection_connectionless_temporal_extent_shortcut(self):
}
}

def test_load_collection_connectionless_shapely_spatial_extent(self):
polygon = shapely.Polygon(((0.0,1.0),(2.0,1.0),(3.0,2.0),(1.5,0.0),(0.0,1.0)))
cube = DataCube.load_collection("T3", spatial_extent=polygon)
assert cube.flat_graph() == {
"loadcollection1": {
"arguments": {"id": "T3", "spatial_extent":
{'coordinates': (((0.0,1.0),(2.0,1.0),(3.0,2.0),(1.5,0.0),(0.0,1.0)),),'type': 'Polygon'},
"temporal_extent": None},
"process_id": "load_collection",
"result": True,
}
}

ElienVandermaesenVITO marked this conversation as resolved.
Show resolved Hide resolved
@pytest.mark.parametrize("path_factory", [str, pathlib.Path])
def test_load_collection_connectionless_local_path_spatial_extent(self, path_factory, test_data):
path = path_factory(test_data.get_path("geojson/polygon02.json"))
cube = DataCube.load_collection("T3", spatial_extent=path)
assert cube.flat_graph() == {
"loadcollection1": {
"arguments": {"id": "T3", "spatial_extent":
{"type": "Polygon", "coordinates": [[[3, 50], [4, 50], [4, 51], [3, 50]]]},
"temporal_extent": None},
"process_id": "load_collection",
"result": True,
}
}

def test_load_collection_connectionless_save_result(self):
cube = DataCube.load_collection("T3").save_result(format="GTiff")
assert cube.flat_graph() == {
Expand Down
Loading