Spatial Index improvements (index-per-graph + kryo) #3026
Labels
enhancement
Incrementally add new feature
Comments
Aklakan pushed a commit to AKSW/jena that referenced this issue (Feb 21, 2025)
Version
5.4.0-SNAPSHOT
Feature
This proposal is to enhance the spatial index with support for index-per-graph as well as to improve its serialization using kryo - via Apache Sedona's kryo/jts implementation.
This is an incremental improvement of the existing JTS-based in-memory implementation - it's not a complete overhaul such as a disk-based, incrementally updated, transaction-aware R-tree (if someone contributed that then this issue's PR could be discarded 😄 ).
The impact of this work has been evaluated and presented at the GeoLD workshop last year (proceedings):
Simon Bin, Claus Stadler, Lorenz Bühmann, and Michael Martin
Getting practical with GeoSPARQL and Apache Jena
Slides
The essence is presented on the following slides:
Using an index per graph (unsurprisingly) boosts the performance when multiple graphs have geometries and only a subset is queried (slide 15):
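To illustrate why an index per graph helps, here is a minimal Python sketch. It is not the Jena implementation: `ToyIndex`, `PerGraphIndex`, and the graph and feature names are hypothetical, and a linear scan stands in for the real R-tree. The point is only that a query restricted to some graphs never touches the entries of the other graphs.

```python
from collections import defaultdict


def intersects(a, b):
    """True if two envelopes (minx, miny, maxx, maxy) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class ToyIndex:
    """Hypothetical stand-in for a spatial index (a real one would be an R-tree)."""

    def __init__(self):
        self.items = []  # list of (envelope, feature) pairs

    def insert(self, envelope, feature):
        self.items.append((envelope, feature))

    def query(self, box):
        return [f for env, f in self.items if intersects(env, box)]


class PerGraphIndex:
    """Keeps one ToyIndex per graph name instead of a single global index."""

    def __init__(self):
        self.by_graph = defaultdict(ToyIndex)

    def insert(self, graph, envelope, feature):
        self.by_graph[graph].insert(envelope, feature)

    def query(self, box, graphs=None):
        # With `graphs` given, only the indexes of those graphs are consulted;
        # without it, every per-graph index is searched.
        targets = list(self.by_graph) if graphs is None else graphs
        hits = []
        for g in targets:
            index = self.by_graph.get(g)  # .get() does not create missing entries
            if index is not None:
                hits.extend(index.query(box))
        return hits


idx = PerGraphIndex()
idx.insert("urn:g1", (0, 0, 1, 1), "urn:a")
idx.insert("urn:g2", (0, 0, 1, 1), "urn:b")
only_g1 = idx.query((0.5, 0.5, 2, 2), graphs=["urn:g1"])
everything = idx.query((0.5, 0.5, 2, 2))
```

With a single global index, the restricted query would first collect candidates from all graphs and filter afterwards; here the work is proportional to the size of the selected graphs only.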
As for serialization performance (slide 16), while index building became a bit slower, this is outweighed by near-instant loading of the spatial index. The reason for the writing overhead is that the index tree is now serialized as a tree - before, the items were written out as a flat list, and the tree had to be rebuilt from scratch on restart.
A new `geosparql:indexPerGraph` option (boolean) is added to the `geosparql:GeosparqlDataset` assembler.
The implementation has been mainly done by @LorenzBuehmann - the writing and presentation is the work of @SimonBin - I supported in evaluation.
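For illustration, an assembler configuration using the proposed option might look roughly like the sketch below. Only `geosparql:indexPerGraph` and `geosparql:GeosparqlDataset` come from this issue; the prefix IRI and the other properties follow the existing Jena GeoSPARQL assembler documentation as far as I recall it, and the node names and paths are placeholders. The final name and semantics of the option may differ from what is shown here.

```turtle
PREFIX rdf:       <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX tdb2:      <http://jena.apache.org/2016/tdb#>
PREFIX geosparql: <http://jena.apache.org/geosparql#>

# Placeholder dataset node wrapping a TDB2 base dataset.
<#geo_ds> rdf:type geosparql:GeosparqlDataset ;
    geosparql:dataset          <#baseDataset> ;
    geosparql:spatialIndexFile "DB/spatial.index" ;   # placeholder path
    geosparql:indexPerGraph    true ;                 # proposed option from this issue
    .

<#baseDataset> rdf:type tdb2:DatasetTDB2 ;
    tdb2:location "DB/" ;                             # placeholder path
    .
```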
As for compatibility, I still need to check whether this is backward compatible, but I think that, due to the change of the serializer, existing spatial indexes would have to be rebuilt.
For reference, a bit of related discussion has happened in #2645.
Are you interested in contributing a solution yourself?
Yes