load_user_data (or similar) #83
👍 to using the existing endpoint
Since it returns an image collection, maybe it makes sense to name it …
It doesn't return an image collection, but a data cube. load_user_data says where to load it from (the user workspace), which is consistent with load_collection (loads data made available via the collections endpoints) and load_result (loads a job result).
Maybe a general `load_data` process:

```json
{
  "process_id": "load_data",
  "arguments": {
    "format": "GTiff",
    "source": "S3",
    "options": {
      "uri": "s3://bucket/prefix",
      "more_options": "here"
    }
  }
}
```
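To illustrate why such a generic node is hard to document: a back-end would have to dispatch on `source` and validate a different option set per source. A minimal Python sketch of that dispatch, with the per-source option names being assumptions (not part of any openEO spec):

```python
# Hypothetical per-source required options; these names are assumptions
# made for illustration, not defined anywhere in the openEO processes.
REQUIRED_OPTIONS = {
    "S3": {"uri"},
    "GCS": {"uri"},
    "disk": {"glob_pattern"},
}

def validate_load_data(arguments: dict) -> None:
    """Check a generic load_data node: known source, required options present."""
    source = arguments.get("source")
    if source not in REQUIRED_OPTIONS:
        raise ValueError(f"unsupported source: {source!r}")
    missing = REQUIRED_OPTIONS[source] - set(arguments.get("options", {}))
    if missing:
        raise ValueError(f"missing options for {source}: {sorted(missing)}")

# The node from the comment above passes validation (raises nothing):
validate_load_data({
    "format": "GTiff",
    "source": "S3",
    "options": {"uri": "s3://bucket/prefix", "more_options": "here"},
})
```

Every new `source` value would add another branch to document, which is the maintenance cost being discussed.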
I feel that this is a bit too generic and it will be hard to document all the options. Wouldn't it be easier to use if we defined more processes for specific use cases? For example, load_s3_data and load_gcs_data or so?
We have a use case where we want to load GeoTIFFs from disk, so I'd like to add something like this:

```json
{
  "process_id": "load_disk_data",
  "arguments": {
    "format": "GTiff",
    "glob_pattern": "/data/MTDA/CGS_S2/CGS_S2_FAPAR/2019/04/24/*/*/10M/*_FAPAR_10M_V102.tif",
    "options": {
      "date_regex": "_(\\d{4})(\\d{2})(\\d{2})T"
    }
  }
}
```
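For readers unfamiliar with the `date_regex` option: the idea is that the back-end derives an observation date from each file path matched by the glob. A small Python sketch, where the concrete file name is hypothetical (the real products just need to contain a `_YYYYMMDDT` timestamp for this regex to match):

```python
import re

# The date_regex from the proposed load_disk_data node above.
date_regex = re.compile(r"_(\d{4})(\d{2})(\d{2})T")

# Hypothetical file name for illustration; only the embedded timestamp matters.
path = "/data/S2A_20190424T105031_FAPAR_10M_V102.tif"

m = date_regex.search(path)
year, month, day = m.groups()  # ("2019", "04", "24")
```

This shows why the process feels back-end specific: the regex encodes knowledge of the internal file-naming scheme.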
@bossie Go ahead and define such a function. I don't think this is a function for the process catalogue though as usually users won't know anything about the internal structure of your disks?! I think this function would be a good start for a list of proprietary extensions for the processes we could list somewhere here. |
Hi Matthias,
Go ahead and define such a function. I don't know what you need, so it is better if you make a proposal we can discuss. The process looks relatively complicated, with regexes etc., and therefore I'm not sure whether it might be too much for the "core". Also, I'm not sure whether this process is limited to your driver or whether other back-ends would also make use of it. I think we should discuss this process separately. In general, I think we should not discuss all kinds of loading functions in this single issue, but open an issue for each of them. Otherwise it gets complicated to follow and manage.
Telco conclusion: hold off on a standardized definition until other back-ends (want to) implement this.
Thanks for the conclusion, @jdries. I'm not sure you discussed what the issue was originally about: load_user_data (but we may choose another name). For the other processes that import from non-API sources, I would clearly separate and define functions such as (names to be discussed) import_s3 (or load_s3), import_nfs, import_gcs etc. whenever required. For this I'd propose to open separate issues or PRs for discussion. Edit: see #105
Specify the data cube loading/storing mechanism in /file_formats, see Open-EO/openeo-processes#83
PR has been merged.
A new process was proposed on the 3rd year planning: load_user_data (or similar).
It should load user-uploaded data and convert it into a data cube, similar to load_collection and load_result.
We need to check how to communicate to a user what file formats are allowed to be uploaded. (Change /output_formats to /file_formats and add a list of supported formats for loading as data cube?)
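The suggested change above (splitting `/output_formats` into a combined `/file_formats` listing) could be sketched as follows. This is a hedged illustration of the response shape being discussed, built as a plain Python dict; the exact field names are assumptions, not spec text:

```python
# Hypothetical shape of a combined /file_formats response: separate lists of
# formats supported for loading ("input") and for storing ("output").
file_formats = {
    "input": {
        "GTiff": {"gis_data_types": ["raster"], "parameters": {}},
    },
    "output": {
        "GTiff": {"gis_data_types": ["raster"], "parameters": {}},
        "netCDF": {"gis_data_types": ["raster"], "parameters": {}},
    },
}

def can_load(fmt: str) -> bool:
    """A client could check input support before uploading a file."""
    return fmt in file_formats["input"]
```

With this split, a client can answer "which formats may I upload and then load as a data cube?" by inspecting `input` alone, which is the communication problem raised above.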