Add Great African Food Company Crop Type Tanzania Dataset #511

nilsleh · 2022-04-20T19:32:59Z

This PR adds the Crop Type Tanzania dataset from Radiant MLHub.

It features time series data with polygon crop type annotations as segmentation masks. I have implemented it as a GeoDataset to allow for train/val/test splits based on geographical location and am looking to do that with other datasets of this type (also converting CV4A_Kenya_Crop_Type dataset to a GeoDataset). Following the TODO in cv4a_kenya_crop_type.py, this implementation populates the rtree index from the stac.json files. The __getitem__ method returns a tensor with dimensions time x num_bands x height x width. The dataset has rasters as input and vector annotations as labels, so it is something of a hybrid between RasterDataset and VectorDataset.
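A minimal sketch of how one stac.json item could be turned into an index entry, not the code from this PR. It assumes the standard STAC item layout (a 2D `bbox` as `[west, south, east, north]` and an RFC 3339 `properties.datetime`) and a non-interleaved `(minx, maxx, miny, maxy, mint, maxt)` tuple for the index. The function name and the sample values are hypothetical.

```python
import json
from datetime import datetime, timezone


def stac_item_to_index_entry(item: dict) -> tuple:
    """Turn one STAC item into (minx, maxx, miny, maxy, mint, maxt)."""
    # STAC stores a 2D footprint as [west, south, east, north].
    minx, miny, maxx, maxy = item["bbox"]
    # Older Python versions reject a trailing "Z" in fromisoformat,
    # so rewrite it as an explicit UTC offset first.
    stamp = item["properties"]["datetime"].replace("Z", "+00:00")
    t = datetime.fromisoformat(stamp).timestamp()
    # A single acquisition has no duration, so mint == maxt.
    return (minx, maxx, miny, maxy, t, t)


# Hypothetical item content, for illustration only.
text = """
{
  "bbox": [34.10, -1.95, 34.15, -1.90],
  "properties": {"datetime": "2019-06-01T00:00:00Z"}
}
"""
entry = stac_item_to_index_entry(json.loads(text))
```

Each resulting tuple could then be inserted into a 3D rtree index together with the file path as the stored object. Note that the bbox here is still in EPSG:4326, while the imagery is in EPSG:32736.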

Dataset Features:

392 annotations with 6 different crop type classes for 44 different labeled areas, each with a varying number of time series inputs in the form of Sentinel-2 imagery

Dataset Format:

separate Sentinel-2 bands as tif files, as well as a cloud probability layer (images in EPSG:32736)
stac.json files for each input image tile (bboxes in EPSG:4326)
geojson files with polygon annotations and labels (polygon coordinates in EPSG:32736)
stac.json for labels

Issues:

I am not creating the correct dummy data; something is off with the bounds, and therefore tests are failing
There aren't many annotations, and the ones that exist are all very small (see below for an example), which makes me question whether I am messing up the indexing when creating a segmentation mask from the polygon annotations
There is also a design choice regarding the datetime included in the stac.json files: if this datetime is used to populate the index, then RandomGeoSampler will also sample time instances, and a given bounding box query will not return all timesteps for a given geographical XY location, as might be expected.
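The datetime question can be illustrated with a toy space-time index. This is only a sketch under assumptions: a plain list stands in for the rtree, the `Entry` and `query` names are hypothetical, and the timestamps are made up. It shows that a query box inheriting the time range of a single entry returns one timestep, while widening the time range returns the full series for the same XY window.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    minx: float
    maxx: float
    miny: float
    maxy: float
    mint: float
    maxt: float
    name: str


def query(entries: List[Entry], minx, maxx, miny, maxy, mint, maxt) -> List[str]:
    """Return names of entries whose space-time box overlaps the query box."""
    return [
        e.name
        for e in entries
        if e.minx <= maxx and e.maxx >= minx
        and e.miny <= maxy and e.maxy >= miny
        and e.mint <= maxt and e.maxt >= mint
    ]


# Three acquisitions of the same tile at different (made-up) timestamps.
tile = (0.0, 10.0, 0.0, 10.0)
entries = [Entry(*tile, t, t, f"t{i}") for i, t in enumerate([100.0, 200.0, 300.0])]

# Query box that carries a narrow time range: only one timestep matches.
one_step = query(entries, 2.0, 4.0, 2.0, 4.0, 150.0, 250.0)

# Same XY window with the time range opened up: every timestep matches.
all_steps = query(entries, 2.0, 4.0, 2.0, 4.0, float("-inf"), float("inf"))
```

One possible design, then, is to keep the datetime in the index but have the dataset widen the time bounds of the query before looking up files, so that `__getitem__` can still stack all timesteps for a location.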

Example: [image]
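The indexing concern raised under Issues can be checked on a toy grid. The sketch below is a pure-Python stand-in, not the PR's implementation (which would more likely rely on something like `rasterio.features.rasterize` with an affine transform). It assumes a north-up raster whose row 0 is the northern edge. The function names, grid size, and coordinates are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring, label, minx, maxy, res, height, width):
    """Burn `label` into every pixel whose center lies inside the ring."""
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        # Row 0 is the northern edge, so y decreases as the row index grows.
        y = maxy - (row + 0.5) * res
        for col in range(width):
            x = minx + (col + 0.5) * res
            if point_in_polygon(x, y, ring):
                mask[row][col] = label
    return mask


# A 20 x 20 unit square in the north-west corner of a 40 x 40 unit tile.
ring = [(0, 20), (20, 20), (20, 40), (0, 40)]
mask = rasterize(ring, label=3, minx=0, maxy=40, res=10, height=4, width=4)
```

If the y axis were computed from the southern edge instead, the labeled block would land in the bottom rows, which is one way a mask can end up misaligned with the imagery.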

adamjstewart · 2022-04-20T19:48:52Z

and am looking to do that with other datasets of this type (also converting CV4A_Kenya_Crop_Type dataset to a GeoDataset)

❤️

adamjstewart · 2022-07-02T15:32:03Z

Superseded by #512

nilsleh added 2 commits April 20, 2022 20:13

first version

d0e94cc

try to fix tests

824da54

github-actions bot added the datasets (Geospatial or benchmark datasets) and testing (Continuous integration testing) labels Apr 20, 2022

adamjstewart added this to the 0.3.0 milestone Apr 20, 2022

improve test coverage

c9988d3

nilsleh mentioned this pull request Apr 21, 2022

Radiant MLHub Crop Type Datasets #512

Closed

4 tasks

adamjstewart closed this Jul 2, 2022

adamjstewart removed this from the 0.3.0 milestone Jul 9, 2022
