Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Querying geographies (spatial index) #72

Open
benbovy opened this issue Nov 13, 2024 · 3 comments
Open

Querying geographies (spatial index) #72

benbovy opened this issue Nov 13, 2024 · 3 comments

Comments

@benbovy
Copy link
Owner

benbovy commented Nov 13, 2024

Similarly to shapely.STRTree, it would be nice if we could provide a way in spherely to create a spatial index of a (large) collection of geographies.

A few thoughts:

  • For a collection of single points, s2geometry provides a S2PointIndex. Actually, I'd be happy to move https://github.com/benbovy/pys2index here, although I think it is useful to also expose it via a simple API like scipy.spatial.cKDTree that is not based on simple features.

  • For a collection of arbitrary geometries, would it make sense to reuse S2ShapeIndex? Which itself can be viewed as a quad-tree... I don't think MutableS2ShapeIndex will be very helpful since IIUC it takes ownership of the S2Shape objects added to the index (EDIT: I misunderstood S2Polyline::Shape, S2Polygon::Shape, etc. that are wrapper classes that do not take ownership of the wrapped S2 objects), but what about EncodedS2ShapeIndex? (although still not ideal to duplicate information in the index). Here we want to retrieve (query) s2geography geographies instead of s2geometry shapes, maybe s2geometry's tagging system would make it possible? If we don't allow adding or removing geographies after creating the index, maybe we could use the indices of the array of geographies as tags? @paleolimbot do you think feat: Implement encoding and decoding for geography subclasses paleolimbot/s2geography#40 would help here?

@paleolimbot
Copy link

It's probably not very efficient, but there is the GeographyIndex:

https://github.com/paleolimbot/s2geography/blob/21400cfc864203db7bb536f37bdbed3682e4140f/src/s2geography/index.h#L18-L23

You can see it in action in the R package:

https://github.com/r-spatial/s2/blob/b495b0df53bffc7dc1ad3780fe8ac208e78cd1bf/src/s2-matrix.cpp#L19

For points I think you do just want the point index (although I don't know if it will be faster to loop once and create the GeographyIndex or loop twice to check if everything is, in fact a point).

Ideally we'd improve on the GeographyIndex by only considering the cell covering of the input (rather than every edge of the input). Then you can build the index using multiple threads or workers (multi thread the calculation of the covering part and pickling/shipping the cell unions between processes is probably cheap). I am not sure if there's something built-in for this, whether we can use the MutableS2ShapeIndex for this, or whether we would have to roll our own.

It would be nice to be able to serialize the index, and if we can somehow use the MutableS2ShapeIndex to do that we get the ability to encode it for free. Serializing the index is a massively beneficial feature (you can build the index once and ship it to a bunch of worker processes or put it in a file), plus S2's encoded shape index is sufficiently lazy that you could probably memory map it and query it without using much memory.

@benbovy
Copy link
Owner Author

benbovy commented Nov 14, 2024

Ah yes I forgot about s2geography::GeographyIndex, which is exactly what we need.

It would also be nice of s2geography::GeographyIndex could provide a high-level query API (e.g., by predicate or distance), using S2ContainsPointQuery, S2ClosestEdgeQuery, etc. under the hood. I.e., port things that you actually implemented in R s2.

@benbovy
Copy link
Owner Author

benbovy commented Nov 14, 2024

Opened paleolimbot/s2geography#45 as it probably makes more sense to continue the discussion about s2geography::GeographyIndex there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants