Job upscaling: spatial splitting utilities #735
Comments
We now have 3 types of job splitters that we are looking into. @VictorVerhaert, @jdries: would you propose to keep this 'non-optimized', with fixed splitting into 20 km tiles? Or should we eventually also offer alternative options, such as Sentinel-2 tile based splitting?
Should we perhaps turn this into an epic? Non-optimized would be the most basic thing to start with, and then we should indeed add things like a UTM grid. (Overlapping S2 tiles are not a grid I typically recommend, so that would be more of an advanced option.)
To get the scope of this issue right: is it only for splitting bounding boxes (e.g. for inference), or also for geometries (e.g. for point extractions)? The 3 existing job splitters in gfmap are mainly meant for extractions, I would say, so they are maybe not that relevant for this issue.
@jdries do we know what the current maximum spatial extent is, given all the optimizations and parallelisations done for lcfm? Perhaps running a job for a whole country is already possible, and we could rather focus on testing it with an output tiling grid set (like the save_result option for GTiffs), to ensure we don't create overly large COGs.
@jdries perhaps it would be good to indeed create a couple of sub-tasks for this one. It would also help to have a clear view of how we want users to 'interact' with or 'experience' this feature.
When using the job manager, users still need to somehow construct the GeoDataFrame that defines initial job splitting.
A typical use case is that users want to run a job over e.g. a full country, and don't want to know the details about tile grids.
There are 'well-known' tile grids that apply: UTM at different sizes for global processing, and LAEA for Europe.
The utility should focus on making UDP-based upscaling as simple as possible: it should reduce the required input parameters to a minimum, while offering optional parameters in case the user has preferences.
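To make the "minimal required input, optional preferences" idea concrete, here is a rough sketch of fixed-size tile splitting. It is not an existing openeo-python-client API: `Tile`, `split_bbox` and the `tile_size` parameter are hypothetical names, the coordinates are assumed to be in a projected CRS with metre units (e.g. a UTM zone), and the 20 km default is only taken from the tile size mentioned in this thread.

```python
import math
from typing import Iterator, NamedTuple


class Tile(NamedTuple):
    """Bounding box of one tile, in projected CRS units (assumed metres)."""
    west: float
    south: float
    east: float
    north: float


def split_bbox(
    west: float,
    south: float,
    east: float,
    north: float,
    tile_size: float = 20_000.0,  # hypothetical default: 20 km tiles
) -> Iterator[Tile]:
    """Yield grid-aligned tiles that together cover the given bounding box."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    if east <= west or north <= south:
        raise ValueError("empty or inverted bounding box")

    # Snap to multiples of tile_size, so that tiles from different
    # (overlapping) requests line up on the same grid.
    col_start = math.floor(west / tile_size)
    col_end = math.ceil(east / tile_size)
    row_start = math.floor(south / tile_size)
    row_end = math.ceil(north / tile_size)

    for row in range(row_start, row_end):
        for col in range(col_start, col_end):
            yield Tile(
                west=col * tile_size,
                south=row * tile_size,
                east=(col + 1) * tile_size,
                north=(row + 1) * tile_size,
            )


tiles = list(split_bbox(500_000, 5_600_000, 545_000, 5_630_000))
```

Each resulting tile could then become one row (for example via `shapely.geometry.box`) of the GeoDataFrame that the job manager expects, so that the user only has to provide an extent.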
It is also an option for the UDP itself to formally indicate splitting options, like a preferred tile grid. For instance, if the UDP author knows that the included job options are optimized for 20 km tiles, it makes sense for the job splitter to take this into account, allowing for a more predictable outcome.
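As an illustration of how a splitter might take such a UDP hint into account, a minimal sketch follows. Everything in it is an assumption: there is currently no agreed metadata field for this, so the `"splitting"` / `"tile_size_m"` keys, the `resolve_tile_size` helper and the precedence order (explicit user preference first, then the UDP hint, then a fallback default) are purely hypothetical.

```python
from typing import Any, Mapping, Optional

# Hypothetical fallback, based on the 20 km tiles mentioned above.
DEFAULT_TILE_SIZE_M = 20_000.0


def resolve_tile_size(
    udp_metadata: Mapping[str, Any],
    user_tile_size: Optional[float] = None,
) -> float:
    """Pick a tile size: user preference, else UDP hint, else default."""
    if user_tile_size is not None:
        return float(user_tile_size)

    # Hypothetical metadata layout: {"splitting": {"tile_size_m": 20000}}
    hint = udp_metadata.get("splitting", {}).get("tile_size_m")
    if hint is not None:
        return float(hint)

    return DEFAULT_TILE_SIZE_M


udp_with_hint = {"splitting": {"tile_size_m": 10_000}}
size_from_hint = resolve_tile_size(udp_with_hint)
size_from_user = resolve_tile_size(udp_with_hint, user_tile_size=50_000)
size_default = resolve_tile_size({})
```

Letting an explicit user choice win over the UDP hint is just one possible design; the opposite (the UDP author's setting being binding) would make outcomes more predictable at the cost of flexibility.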
Related documentation: openeo-python-client/docs/cookbook/job_manager.rst, line 124 in c1589a8
Existing and similar code in aggregator:
https://github.com/Open-EO/openeo-aggregator/blob/master/src/openeo_aggregator/partitionedjobs/splitting.py