-
Notifications
You must be signed in to change notification settings - Fork 19
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
3d1981a
commit c05e09e
Showing
1 changed file
with
103 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,103 @@ | ||
--- | ||
title: 'Release 4.0: 🚄 Faster actor testing and modularized expressions' | ||
--- | ||
|
||
Earlier this year, [Comunica version 3.0 was released](/blog/2024-03-19-release_3_0/), | ||
which introduced better querying planning across data sources and several convenience features. | ||
Since then, we had several minor releases introducing multiple additions and performance improvements. | ||
In [release 3.2](/blog/2024-07-05-release_3_2/), we introduced better techniques for discovering performance bottleneck. | ||
This allowed us identify some low-level bottlenecks in the internals of Comunica which limited performance. | ||
Fixing these issues requires breaking changes to the internal API of Comunica, | ||
which is the main reason for this major update. | ||
As a result of these changes, we see **performance improvements of 20% to 40%**. | ||
Breaking changes are limited to internal Comunica APIs. | ||
This means that developers that make use of Comunica can have their cake _and_ eat it 🎂; | ||
no breaking changes _and_ gaining a performance boost. | ||
|
||
<!-- excerpt-end --> | ||
|
||
## 🪢 New `Actor.test` contract for better performance | ||
|
||
To determine which actors can answer a certain action, | ||
all Comunica actors expose a `test` method, which used to look something like this: | ||
|
||
```typescript | ||
class MyActor extends Actor { | ||
public async test(action: IAction): Promise<IActorTest> { | ||
if (conditionNotMet(action)) { | ||
throw new Error('This actor can not handle the action'); | ||
} | ||
return true; | ||
} | ||
} | ||
``` | ||
|
||
The problem with the above is that JavaScript engines such as V8 | ||
will eagerly build internal stacktraces when creating `Error` objects. | ||
Since Comunica has a large number of actors (240 at the time of writing), | ||
an average query execution can lead to a huge number of internal `Error` objects being created. | ||
According to our measurements, this produced a non-negligible performance overhead. | ||
|
||
As such, we refactored the contract of the `test` method to not rely on these `Error` objects anymore. | ||
Instead, `test` methods now make use of `TestResult` objects, | ||
which in practise look like this: | ||
|
||
```typescript | ||
class MyActor extends Actor { | ||
public async test(action: IAction): Promise<TestResult<IActorTest>> { | ||
if (conditionNotMet(action)) { | ||
return failTest('This actor can not handle the action'); | ||
} | ||
return passTestVoid(); | ||
} | ||
} | ||
``` | ||
|
||
For various benchmarks on in-memory triple stores, this change makes queries up to 20% faster. | ||
|
||
## 🧩 Modularization of expressions logic | ||
|
||
Thanks to [Jitse De Smet](https://jitsedesmet.be/)'s monumental effort, | ||
all expressions-related logic in Comunica is now fully modularized. | ||
Previously, the handling of filters and aggregates were all delegated to the singular `sparqlee` package. | ||
While this package did a great job of handling filters and aggregates, | ||
it lacked the modularity that existed for all other parts of query execution. | ||
For example, it was not possible to easily plug in your own actor to evaluate the `SUM` aggregator in a different way. | ||
|
||
With this release, the `sparqlee` has been split up into multiple buses and actors, | ||
which are responsible for term comparators, function, and aggregators. | ||
For this, we avoided any kind of performance degradation. | ||
|
||
Learn more about [expressions evaluation in our documentation](https://comunica.dev/docs/modify/advanced/expression-evaluator/). | ||
|
||
## 🚄 Performance improvements | ||
|
||
Besides the changes mentioned above, | ||
there are a number of smaller changes that have a positive impact on performance that are worth mentioning: | ||
|
||
- [Optimize Bindings merge logic](https://github.com/comunica/comunica/commit/20daf1761b1ec4c82357909a45fa4f84e2754080) | ||
- [Fix internal cardinalities being wrong for SPARQL endpoints with VoID](https://github.com/comunica/comunica/commit/075f5dde03354f0516398235146d9a93883e8b66) | ||
- [Hash Joins now always use 32-bit numbers](https://github.com/comunica/comunica/commit/c1474e4167f1978e144bee9b7663fcb2bafc21bf), which speeds up operations in the V8 engine. | ||
- [Hash joins use the faster Murmur3 hash method](https://github.com/comunica/comunica/commit/331865c773882be7e061a690833fc841ece27339) | ||
- [Only consider overlapping vars when testing undef in join actors](https://github.com/comunica/comunica/commit/5ddb64bb1df4b1179529f3e2ea0c05fb321dfce9) | ||
- [Refactor HTTP fetch and retry logic](https://github.com/comunica/comunica/commit/b37d7e008b77067f6c6249aadfc901baf42a6914), which leads to more stable query execution over servers that make use of rate limits | ||
|
||
## 🤝 Contributors | ||
|
||
This release has been made possible thanks to the help of the following contributors (in no particular order): | ||
|
||
- [Jonni Hanski](https://github.com/surilindur) | ||
- [Jitse De Smet](https://github.com/jitsedesmet) | ||
- [Ruben Eschauzier](https://github.com/RubenEschauzier) | ||
- [Karel Klíma](https://github.com/karelklima) | ||
- [Ieben Smessaert](https://github.com/smessie) | ||
- [Maarten Vandenbrande](https://github.com/maartyman) | ||
- [Bryan-Elliott Tam](https://github.com/constraintAutomaton) | ||
- [Jesse Wright](https://github.com/jeswr) | ||
- [Ruben Taelman](https://github.com/rubensworks/) | ||
|
||
## Full changelog | ||
|
||
While this blog post explained the primary changes in Comunica 4.x, | ||
there are actually many more smaller changes internally that will make your lives easier. | ||
If you want to learn more about these changes, check out the [full changelog](https://github.com/comunica/comunica/blob/master/CHANGELOG.md#v401---2024-10-15). |