-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Large geometry intersection check perf speedup #701
Conversation
Now we're talking: Old (red, naïve) vs new (blue, tree) (sorry for reversing the setup this time!) The naïve version has to potentially do ~860 million line-line intersection checks so 165 ms isn't bad, but the R*-tree mean is only 35.4 ms… |
Cool! Just as a reminder, @rmanoka did a somewhat related investigation of intersection perf here: It's a slightly different use case, but the findings might be relevant. In particular, I sort of expect the performance of R-Tree to be better than the naive approach as a.segments() * b.segements() scales. So if you did want to have a threshold to toggle between "naive" vs. "rtree" using a multiplicative, rather than an additive might make more sense.
|
Yeah, the multiplicative is definitely better. Next up, I need to figure out a way of getting the large polygon data into the benchmark; |
Maybe #566? (lol not to just dump my wishlist on you or anything...) |
Hmm as constructed |
Update: we're checking for line intersections in this case, since we're decomposing the polygons into lines. But that excludes the possibility of |
If / when #351 lands, the easier case ( |
853620f
to
332e949
Compare
geo_types::Line<T>: RTreeObject, | ||
{ | ||
fn intersects(&self, linestring: &LineString<T>) -> bool { | ||
if (self.exterior().0.len() + self.interiors().iter().map(|ls| ls.0.len()).sum::<usize>()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (self.exterior().0.len() + self.interiors().iter().map(|ls| ls.0.len()).sum::<usize>())
#707 😄
let lines_a: Vec<_> = self | ||
.exterior() | ||
.lines() | ||
.chain(self.interiors().iter().flat_map(|ls| ls.lines())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the last coordinate of the exterior ring connect to the first coordinate of the first interior ring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, that's not what's happening here, nevermind!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meep meep #708
Now that #829 is merged, we might get comparable perf via something like |
|
Do you still have an example of the data you were using to see this kind of performance? I'd expect the R-Tree approach to be a constant factor slower, but better asymptotically — much better for big inputs and a little slower for small inputs. I am curious though as to what range of geometry sizes we can expect to see the tradeoffs. |
If you check out this branch and run the boolean ops benchmarks you should see the perf regression in the criterion output. I should also note that this was a quick check that I didn't dig into, so I may have messed something up. |
Above some threshold, it may be faster to load the geometry or geometries into an R*-star tree and query for a subset of intersection candidates.
Also use multiplicative threshold (TODO: what should it be?)
(I rebased against |
We have prepared geometries for this now! |
Above some threshold, it may be faster to load the geometry or geometries into an R*-star tree and query for a subset of intersection candidates.
This is a draft PR because:
I haven't benchmarked the tree-query logic in order to figure out even an approximate value for
MAX_NAIVE_SEGMENTS
I haven't figured out whether checking the number of segments in order to decide to switch to trees is an unacceptable perf hit
I agree to follow the project's code of conduct.
I added an entry to
CHANGES.md
if knowledge of this change could be valuable to users.