Create/Add geoparquet files #16

Draft: m-mohr wants to merge 3 commits into main

m-mohr
Contributor

@m-mohr m-mohr commented Jul 3, 2023

This is more of a proposal than a "ready-to-merge" PR.

It generates the geoparquet files for all collections nicely and they can be read in QGIS.
Demo in STAC Browser: https://radiantearth.github.io/stac-browser/#/external/raw.githubusercontent.com/m-mohr/open-stac/geoparquet/stac/open-skysat-data/collection.json?.asset=asset-geoparquet-items

I assume we should remove the GH action and instead add the creation procedure to the Makefile. I'm not really a Makefile user, so maybe @tschaub can solve this more easily/quickly? That is why this is still a draft.

@cholmes
Member

cholmes commented Jul 26, 2023

I think it may be fine to just leave it as a GH action? The output looks like it goes to the repo, so the actual files could then just transfer over. I think Tim likes GitHub Actions too, and did the Makefile since we didn't have this public yet.

One minor quibble - I don't think I like the .geoparquet suffix, since I think it should be able to just be treated as 'parquet'. Is it Tom's scripts that do that? I can raise the issue there to discuss.

@m-mohr
Contributor Author

m-mohr commented Jul 26, 2023

Happy to change to .parquet. I thought .geoparquet was the way to go, but .parquet also works. It just needs an update to the CLI commands here; I'll update it.

The GH action thing is a bit tricky because it creates a commit on the same branch that the triggering commit came from. That is somewhat "dirty", I think. It would be a bit cleaner if the files were just generated during the publish step, but I'd need to learn Makefile first. That is why I was thinking it could make sense if Tim does this in 5 minutes, while I may need an hour or so ;-)

@tschaub
Member

tschaub commented Jul 26, 2023

If we want to make it easier for @m-mohr or others to make changes that affect the deployment, we can deploy this to GitHub Pages instead of www.planet.com. If it is deployed at www.planet.com then changes need to be made to the GitLab CI jobs (the GitHub CI jobs continue to make sense as a place to run things that verify that commits look good, but not to change things about the deployment).

@m-mohr
Contributor Author

m-mohr commented Jul 26, 2023

Updated from .geoparquet to .parquet.

Another reason to create the geoparquet during the deployment: the self links are not absolute here, but I guess you also want the self_link column in the Planet Open Data. So running the geoparquet step after creating the absolute links would be good. Alternatively, I'd go the same route as we did with rapidai4eo and provide the base URL to create the self links on the fly.
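
To illustrate the second option, here is a minimal sketch of building absolute self links on the fly from a base URL. It is not the code used in this PR or in rapidai4eo; the function names, the `path` key on each record, and the example.com URL are all hypothetical and only there for illustration.

```python
from urllib.parse import urljoin


def absolute_self_link(base_url: str, relative_path: str) -> str:
    """Resolve a catalog-relative path against a deployment base URL."""
    # urljoin drops the last path segment of a base that lacks a trailing
    # slash, so normalize the base first.
    if not base_url.endswith("/"):
        base_url += "/"
    # A leading slash would make urljoin discard the base path entirely.
    return urljoin(base_url, relative_path.lstrip("/"))


def with_self_links(items: list[dict], base_url: str) -> list[dict]:
    """Return copies of item records with a `self_link` field added.

    Assumption: each record carries a hypothetical `path` key holding its
    location relative to the catalog root.
    """
    return [
        {**item, "self_link": absolute_self_link(base_url, item["path"])}
        for item in items
    ]


if __name__ == "__main__":
    # Placeholder base URL, not the real deployment target.
    rows = with_self_links(
        [{"id": "item-1", "path": "open-skysat-data/item-1.json"}],
        "https://example.com/stac",
    )
    print(rows[0]["self_link"])
```

With something like this, the base URL could be passed in at deployment time, so the geoparquet step would not depend on the links in the JSON files already being absolute.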

3 participants