Radiant MLHub Crop Type Datasets #512
After spending some time on this, I have a more general question about these types of datasets. And since to my knowledge there is not yet a I am hereafter assuming that the desired behavior for such time-series raster datasets is a The following outlines the different approaches and observations I have made:
Another observation is that not all labels span the same time horizon. While some labels have, let's say, 40 corresponding images, others might have 70. Hence, consider the case where a bounding box from the sampler covers a region that intersects two or more such labels. What is the proper way of merging the varying time dimensions of rasters to yield one sample, in addition to merging individual bands of each of the samples like Maybe I am also thinking about this wrong or missing something. Either way, I would welcome suggestions/comments.
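To make the merging question concrete, here is a minimal sketch of one possible approach: align every label's series on the union of all acquisition dates and keep a validity mask for the missing timesteps. This is only an illustration and not how torchgeo handles it; the function name `align_on_union_of_dates` and the dict-based input layout are hypothetical.

```python
from datetime import date


def align_on_union_of_dates(series_by_label, fill=None):
    """Align per-label time series on the union of all dates (hypothetical helper).

    series_by_label maps a label id to a dict of {acquisition date: value}.
    Returns the sorted union of dates, the aligned values per label, and a
    boolean mask per label that is False wherever a value had to be filled in.
    """
    all_dates = sorted({d for series in series_by_label.values() for d in series})
    aligned = {}
    valid = {}
    for label_id, series in series_by_label.items():
        aligned[label_id] = [series.get(d, fill) for d in all_dates]
        valid[label_id] = [d in series for d in all_dates]
    return all_dates, aligned, valid


# Two labels with different numbers of acquisitions (made-up values).
example = {
    "label_a": {date(2021, 1, 1): 1.0, date(2021, 1, 11): 2.0, date(2021, 1, 21): 3.0},
    "label_b": {date(2021, 1, 1): 10.0, date(2021, 1, 21): 30.0},
}
dates, aligned, valid = align_on_union_of_dates(example)
```

The mask lets a model or loss ignore filled timesteps instead of treating them as real observations. Truncating to the intersection of dates would be the alternative, at the cost of discarding imagery.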
Great points. Maybe move this #512 (comment) to a new issue. There's a lot to digest here, and I can see different use cases depending on time sensitivity, which would require different indexing styles. E.g. time-sensitive flood mapping, where you want to map 1 label mask : 1 time-slice, and land cover classification, where you could have 1 label mask : N time-slices (though land cover could change over longer periods of time).
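The two indexing styles can be sketched roughly as follows. This is a plain-Python illustration under my own assumptions, not an existing torchgeo API; `pair_one_to_one` and `pair_one_to_many` are hypothetical names, and "closest date" is just one possible matching rule for the 1:1 case.

```python
from datetime import date


def pair_one_to_one(label_dates, image_dates):
    """1 label mask : 1 time-slice. Match each label to its closest image date."""
    return {
        label_date: min(image_dates, key=lambda d: abs((d - label_date).days))
        for label_date in label_dates
    }


def pair_one_to_many(label_windows, image_dates):
    """1 label mask : N time-slices. Collect every image inside the label's window.

    label_windows maps a label id to an inclusive (start, end) date range.
    """
    return {
        label_id: [d for d in sorted(image_dates) if start <= d <= end]
        for label_id, (start, end) in label_windows.items()
    }


# Made-up acquisition dates for illustration.
image_dates = [date(2021, 6, 1), date(2021, 6, 11), date(2021, 6, 21)]
flood_pairs = pair_one_to_one([date(2021, 6, 12)], image_dates)
crop_pairs = pair_one_to_many(
    {"field_1": (date(2021, 6, 5), date(2021, 6, 30))}, image_dates
)
```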
Yes, I think #511 can be closed. But this PR still needs some work, since all of the datasets are time-series, and additionally three of them are very label sparse, so I am not sure I want to add them. However, depending on how we decide to handle time-series data, I will add the South Africa Crop Type dataset and then convert the CV4A_Crop_Type dataset to Geodatasets.
#1840 is adding South Africa Crop Type Competition
I think we can close this. Radiant MLHub doesn't even exist anymore. Most (not all) datasets were moved to Source Cooperative, but the file hierarchy and file formats are completely different, so most of these datasets would have to be rewritten from scratch anyway.
This PR supersedes #511, because I found that Radiant MLHub has multiple crop type datasets that follow almost exactly the same format. Only the label.geojson files differ, holding the crop type label under different keys. This PR adds the following four crop type segmentation datasets under one abstract class:
Dataset Format:
These issues still persist across datasets:
RandomGeoSampler will also sample time instances, so a given bounding box query will not return all timesteps for a given geographical XY location, as one might expect.
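One way to avoid sampling over time would be to randomize only the spatial extent and always keep the full time range of the region of interest. The sketch below shows the idea in plain Python. The `Box` tuple and `sample_spatial_only` are hypothetical stand-ins that I made up for illustration; this is not the actual RandomGeoSampler implementation.

```python
import random
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical spatiotemporal bounding box (not a torchgeo class)."""

    minx: float
    maxx: float
    miny: float
    maxy: float
    mint: float
    maxt: float


def sample_spatial_only(roi: Box, size: float, rng: random.Random) -> Box:
    """Draw a random size-by-size window inside roi, keeping roi's full time range."""
    if size > roi.maxx - roi.minx or size > roi.maxy - roi.miny:
        raise ValueError("patch size exceeds the region of interest")
    minx = rng.uniform(roi.minx, roi.maxx - size)
    miny = rng.uniform(roi.miny, roi.maxy - size)
    # The time extent is copied from the ROI, so every timestep is queried.
    return Box(minx, minx + size, miny, miny + size, roi.mint, roi.maxt)


roi = Box(0.0, 1000.0, 0.0, 1000.0, 0.0, 365.0)
query = sample_spatial_only(roi, size=256.0, rng=random.Random(0))
```

With a query like this, the dataset would be responsible for returning all timesteps that intersect the spatial window, which brings back the question of how to stack series of different lengths.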