improve generate fragment compression #6651
Conversation
@dariuszkuc, please consider creating a changeset entry in …
✅ Docs preview has no changes. The preview was not built because there were no changes. Build ID: aabda6874c274ec47673326a
Just had one question out of curiosity 🗒️
Force-pushed 208f22d to ceac413
Force-pushed 17abcd6 to 5a75c3d
We're definitely moving in the first direction. The hash implementation will work. I don't think we should try to make too many changes that attempt to improve performance without actual numbers, so what we have here will work as a baseline, I think. Making the `Hash` impl of `Selection` and `SelectionSet` a regular method I still think is a good idea, just so we don't violate the hash contract sometime in the future.

My only remaining concern is the hash collision risk in the `FragmentGenerator`'s maps. Of course the chance of a collision is near zero. If you think it will be ok, we can leave it. It is just sticking out to me.
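A minimal sketch of the "regular method instead of a `Hash` impl" idea mentioned above. The type and method names here are illustrative, not the router's actual code: the point is that exposing the structural hash as an ordinary named method keeps `std::hash::Hash` (and its contract with `Eq`) out of the picture entirely, so callers opt in to this hashing deliberately.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical simplified stand-in for the router's Selection type.
#[derive(PartialEq, Eq)]
struct Selection {
    field: String,
    sub_selections: Vec<Selection>,
}

impl Selection {
    // Explicit structural hash as a regular method; since this is not a
    // std Hash impl, it cannot silently disagree with Eq via HashMap use.
    fn structural_hash(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.hash_into(&mut hasher);
        hasher.finish()
    }

    // Recursively feed the selection tree into the hasher.
    fn hash_into(&self, hasher: &mut impl Hasher) {
        self.field.hash(hasher);
        for sub in &self.sub_selections {
            sub.hash_into(hasher);
        }
    }
}

fn main() {
    let s = Selection {
        field: "id".to_string(),
        sub_selections: vec![],
    };
    println!("structural hash: {:x}", s.structural_hash());
}
```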
In v1 we were comparing the selections of inline fragments, so technically there was a chance of collision there as well (I guess it was a smaller chance, as we were limiting it to the specific type condition's selections). Any suggestions on how to improve our hashing to mitigate this risk?
My thoughts would be to make …
Force-pushed e011999 to d3a355a
This looks really promising! I left a few comments that are probably not critically important. The perf tradeoffs you've made seem reasonable.
I do think that we should avoid the hash collision risk. It's unlikely, yes, but collision avoidance isn't the top priority of hashers for hashmaps, so it's a lot more likely we get a collision by accident than with a sha256 or whatever.
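To put a rough number on "unlikely": a standard birthday-bound estimate says that for a well-distributed 64-bit hash over n distinct keys, the collision probability is at most about n² / 2⁶⁵. This is only a heuristic (real hashmap hashers trade collision resistance for speed, as noted above), but it shows why "near zero" can still be worth eliminating:

```rust
// Birthday-bound upper estimate for a collision among n distinct keys
// hashed into a 64-bit space: roughly n^2 / 2^65. Heuristic only; it
// assumes the hash behaves like a uniform random function.
fn collision_probability_upper_bound(n: f64) -> f64 {
    n * n / 2f64.powi(65)
}

fn main() {
    // e.g. one million distinct selection keys
    println!("{:e}", collision_probability_upper_bound(1_000_000.0));
}
```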
CI performance tests
Refactors the code to avoid double hashing of fragment keys (`selection key -> u64`). While the likelihood of hash collisions was already low, double hashing does increase the (small) odds. Instead of just storing the counts of each `SelectionCountKey`, we now also store an auto-generated `SelectionId` to uniquely identify the fragments.
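The idea can be sketched as follows. This is a simplified illustration, not the router's actual implementation: `SelectionId` and the counter type are hypothetical stand-ins showing how pairing each count with a unique id means two distinct selections can never be merged by an accidental hash collision.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

static NEXT_ID: AtomicU64 = AtomicU64::new(0);

// Hypothetical unique identifier: a monotonically increasing counter,
// so two distinct selections can never share an id (unlike a u64 hash).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct SelectionId(u64);

impl SelectionId {
    fn new() -> Self {
        SelectionId(NEXT_ID.fetch_add(1, Ordering::Relaxed))
    }
}

// Stores, per selection key, both its occurrence count and the unique
// id that identifies the fragment generated for it.
#[derive(Default)]
struct FragmentCounter {
    counts: HashMap<String, (SelectionId, u32)>,
}

impl FragmentCounter {
    // Record one occurrence of a selection, returning its stable id.
    fn record(&mut self, selection_key: &str) -> SelectionId {
        let entry = self
            .counts
            .entry(selection_key.to_string())
            .or_insert_with(|| (SelectionId::new(), 0));
        entry.1 += 1;
        entry.0
    }

    // How many times a selection has been seen so far.
    fn count(&self, selection_key: &str) -> u32 {
        self.counts.get(selection_key).map(|(_, c)| *c).unwrap_or(0)
    }
}

fn main() {
    let mut counter = FragmentCounter::default();
    counter.record("{ id name }");
    counter.record("{ id name }");
    println!("seen {} times", counter.count("{ id name }"));
}
```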
Force-pushed 5d89f12 to e264f4f
@Mergifyio backport 1.x
✅ Backports have been created
Update `generate_fragment` algorithm to extract named fragments from common sub-selections in the given operation. Fragments are now auto-generated for each sub-selection that occurs at least twice within the final fetch operation.

The algorithm consists of two phases:
- recursively iterate over the selection sets and collect sub-selection usages
- recursively iterate over the selection sets to find which sub-selections were used more than once
- use each such sub-selection to create a new fragment definition and replace the current selection set with just a fragment spread

Based on preliminary testing over 2.8 million ops, this new algorithm:
* generates better results in 95% of test operations (vs 77% for the old one)
* incurs just 10ms of p99 overhead for running the compression
* in cases where it generates better results, shrinks queries by 50KB at p99
* in cases where it generates worse results, adds only an additional 108 bytes to a query at p99

(cherry picked from commit 273ee25)
# Conflicts:
#	apollo-router/src/plugins/connectors/tests/mod.rs
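The two phases described above can be sketched over a toy representation in which each selection set is just a canonical string. This is illustrative only, not the router's real implementation; the fragment naming scheme and the `on T` type condition are hypothetical placeholders:

```rust
use std::collections::HashMap;

// Phase 1: walk all selection sets and count how often each occurs.
fn collect_usages(selection_sets: &[&str]) -> HashMap<String, u32> {
    let mut counts = HashMap::new();
    for s in selection_sets {
        *counts.entry((*s).to_string()).or_insert(0) += 1;
    }
    counts
}

// Phase 2: any sub-selection used at least twice becomes a named
// fragment; each of its occurrences is replaced by a fragment spread.
fn compress(
    selection_sets: &[&str],
    counts: &HashMap<String, u32>,
) -> (Vec<String>, Vec<String>) {
    let mut fragments: Vec<String> = Vec::new();
    let mut fragment_names: HashMap<String, String> = HashMap::new();
    let mut rewritten = Vec::new();
    for s in selection_sets {
        if counts[*s] >= 2 {
            // Generate the fragment definition once, on first use.
            let name = fragment_names
                .entry((*s).to_string())
                .or_insert_with(|| {
                    let name = format!("_generated_{}", fragments.len());
                    fragments.push(format!("fragment {name} on T {s}"));
                    name
                })
                .clone();
            rewritten.push(format!("{{ ...{name} }}"));
        } else {
            rewritten.push((*s).to_string());
        }
    }
    (rewritten, fragments)
}

fn main() {
    let sets = ["{ id name }", "{ id name }", "{ email }"];
    let counts = collect_usages(&sets);
    let (rewritten, fragments) = compress(&sets, &counts);
    println!("{rewritten:?}\n{fragments:?}");
}
```

Repeated selection sets collapse into a single fragment definition plus spreads, while one-off selections are left untouched, matching the "occurs at least twice" rule above.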
Checklist
Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.
Exceptions
Note any exceptions here
Notes
Footnotes
It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this. ↩
Configuration is an important part of many changes. Where applicable please try to document configuration examples. ↩
Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions. ↩