improve generate fragment compression #6651

dariuszkuc · 2025-01-27T05:25:40Z

Update the `generate_fragment` algorithm to extract named fragments from common sub selections in the given operation. Fragments are now auto-generated for each sub selection that occurs at least twice within the final fetch operation.

The algorithm consists of two phases:

- recursively iterate over the selection sets and collect sub selection usages
- recursively iterate over the selection sets to find which sub selections were used more than once
  - use each such sub selection to create a new fragment definition and replace the current selection set with just a fragment spread
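
As a rough illustration of the two phases above, here is a minimal Rust sketch. It assumes a heavily simplified, hypothetical `Selection` type (field names and nested selections only), and the names `FragmentSketch`, `compress`, and `_generated_N` are invented for this example. It is not the code from this PR, which operates on the real apollo-federation selection types and has to deal with type conditions, directives, and order-independent hashing.

```rust
use std::collections::HashMap;

// Hypothetical, heavily simplified selection model (not the apollo-federation types).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Selection {
    Field { name: String, children: Vec<Selection> },
    FragmentSpread(String),
}

type SelectionSet = Vec<Selection>;

// Small helper to build a field selection.
fn field(name: &str, children: Vec<Selection>) -> Selection {
    Selection::Field { name: name.to_string(), children }
}

struct FragmentSketch {
    counts: HashMap<SelectionSet, usize>,     // phase 1: usages per sub selection
    names: HashMap<SelectionSet, String>,     // sub selection -> fragment name
    definitions: Vec<(String, SelectionSet)>, // generated fragment definitions
}

impl FragmentSketch {
    // Phase 1: recursively count how often each non-empty sub selection occurs.
    fn collect(&mut self, set: &SelectionSet) {
        for selection in set {
            if let Selection::Field { children, .. } = selection {
                if children.is_empty() {
                    continue;
                }
                let count = self.counts.entry(children.clone()).or_insert(0);
                *count += 1;
                let first_seen = *count == 1;
                // Descend only on first sight, so selections nested inside a
                // repeated parent are not counted once per parent usage.
                if first_seen {
                    self.collect(children);
                }
            }
        }
    }

    // Phase 2: replace every sub selection used at least twice with a spread.
    fn rewrite(&mut self, set: &SelectionSet) -> SelectionSet {
        let mut out = Vec::with_capacity(set.len());
        for selection in set {
            match selection {
                Selection::Field { name, children } if !children.is_empty() => {
                    let used = self.counts.get(children).copied().unwrap_or(0);
                    let new_children = if used >= 2 {
                        vec![Selection::FragmentSpread(self.fragment_for(children))]
                    } else {
                        self.rewrite(children)
                    };
                    out.push(Selection::Field { name: name.clone(), children: new_children });
                }
                other => out.push(other.clone()),
            }
        }
        out
    }

    // Returns the fragment name for a sub selection, defining it on first use.
    fn fragment_for(&mut self, set: &SelectionSet) -> String {
        if let Some(name) = self.names.get(set) {
            return name.clone();
        }
        let body = self.rewrite(set);
        let name = format!("_generated_{}", self.definitions.len());
        self.names.insert(set.clone(), name.clone());
        self.definitions.push((name.clone(), body));
        name
    }
}

// Runs both phases and returns the rewritten operation plus its fragments.
fn compress(operation: &SelectionSet) -> (SelectionSet, Vec<(String, SelectionSet)>) {
    let mut sketch = FragmentSketch {
        counts: HashMap::new(),
        names: HashMap::new(),
        definitions: Vec::new(),
    };
    sketch.collect(operation);
    let rewritten = sketch.rewrite(operation);
    (rewritten, sketch.definitions)
}
```

In this sketch, calling `compress` on an operation where fields `a` and `b` both select `{ id name }` yields one generated fragment and replaces both sub selections with a spread of it. A sub selection that appears only once is left inline.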

Based on preliminary testing over 2.8 million operations, the new algorithm:

- generates better results in 95% of test operations (vs. 77% for the old one)
- adds just 10ms of overhead at p99
- when it produces better results, shrinks queries by 50Kb at p99
- when it produces worse results, adds only 108 bytes to a query at p99

github-actions · 2025-01-27T05:25:55Z

@dariuszkuc, please consider creating a changeset entry in /.changesets/. These instructions describe the process and tooling.

svc-apollo-docs · 2025-01-27T05:26:23Z

✅ Docs preview has no changes

The preview was not built because there were no changes.

Build ID: aabda6874c274ec47673326a

briannafugate408

Just had one question out of curiosity 🗒️

apollo-router/src/configuration/mod.rs

apollo-federation/src/operation/optimize.rs

apollo-federation/src/operation/selection_map.rs

apollo-federation/src/query_plan/query_planner.rs

apollo-federation/src/operation/optimize.rs

TylerBloom

We're definitely moving in the right direction. The hash implementation will work. I don't think we should make too many performance-motivated changes without actual numbers, so what we have here will work as a baseline, I think. I still think making the Hash impl of Selection and SelectionSet a regular method is a good idea, just so we don't violate the hash contract sometime in the future.

My only remaining concern is the hash collision of the FragmentGenerator's maps. Of course the chance of a collision is near zero. If you think it will be OK, we can leave it. It is just sticking out to me.

apollo-federation/src/operation/mod.rs

dariuszkuc · 2025-01-29T17:48:37Z

> My only remaining concern is the hash collision of the FragmentGenerator's maps. Of course the chance of a collision is near zero. If you think it will be OK, we can leave it. It is just sticking out to me.

In v1 we were comparing the selections of inline fragments, so technically there was a chance of collision there as well (I guess it was a smaller chance, as we were limiting it to the specific type condition selections). Any suggestions on how to improve our hashing to mitigate this risk?

TylerBloom · 2025-01-29T18:56:58Z

> Any suggestions on how to improve our hashing to mitigate this risk?

My thoughts would be to make NamedFragmentCandidateKey the key to the maps. To avoid hashing multiple times, you can cache the hash and just derive PartialEq. That said, it sounds like a non-issue.
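
To make the suggestion above concrete, here is a minimal sketch of a map key that caches its hash while still comparing full contents. `CachedKey` and `count_usages` are hypothetical names for this example. The actual fields of `NamedFragmentCandidateKey` are not shown in this thread, so a generic wrapper stands in for it.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical key wrapper: keeps the full value plus a hash computed once.
// The derived PartialEq/Eq compare both fields, so two different values that
// happen to share a hash are still treated as distinct keys.
#[derive(Debug, Clone, PartialEq, Eq)]
struct CachedKey<T> {
    hash: u64,
    value: T,
}

impl<T: Hash> CachedKey<T> {
    fn new(value: T) -> Self {
        // The (potentially expensive) hash of the value is computed only here.
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        Self { hash: hasher.finish(), value }
    }
}

impl<T> Hash for CachedKey<T> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Feed only the cached hash to the map's hasher. Equal values always
        // have equal cached hashes, so the Hash/Eq contract still holds.
        state.write_u64(self.hash);
    }
}

// Counts how often each item occurs, using the cached-hash key.
fn count_usages<T: Hash + Eq>(items: impl IntoIterator<Item = T>) -> HashMap<CachedKey<T>, usize> {
    let mut counts = HashMap::new();
    for item in items {
        *counts.entry(CachedKey::new(item)).or_insert(0) += 1;
    }
    counts
}
```

Because equality falls back to the full value, a hash collision only costs an extra comparison rather than merging two unrelated entries.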

apollo-federation/src/operation/mod.rs

apollo-federation/src/operation/selection_map.rs

apollo-federation/src/operation/mod.rs

goto-bus-stop

This looks really promising! I left a few comments that are probably not critically important. The perf tradeoffs you've made seem reasonable.

I do think that we should avoid the hash collision risk. It's unlikely, yes, but collision avoidance isn't the top priority of hashers for hashmaps, so it's a lot more likely we get a collision by accident than with a sha256 or whatever.

apollo-federation/src/operation/optimize.rs

router-perf · 2025-01-31T17:32:45Z

CI performance tests

Refactors the code to avoid double hashing of fragment keys (`selection key -> u64`). While the likelihood of hash collisions was low, double hashing does increase the (small) odds. Instead of just storing the counts of each `SelectionCountKey`, we now also store an auto-generated `SelectionId` to uniquely identify the fragments.
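
One way to picture this change is the sketch below. The name `SelectionId` comes from the description above, but `SelectionCount`, `UsageTable`, and everything about their layout are assumptions made for illustration, not the PR's actual definitions. The point is that the map is keyed by the structural key itself and hands out a small unique id, so later lookups never depend on a second `u64` hash of the key.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Assumed shape: an opaque, auto-generated identifier for a sub selection.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct SelectionId(u64);

// Hypothetical record stored per key: the usage count plus the assigned id.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SelectionCount {
    id: SelectionId,
    count: usize,
}

// Hypothetical usage table, generic over whatever key type identifies a selection.
struct UsageTable<K> {
    next_id: u64,
    usages: HashMap<K, SelectionCount>,
}

impl<K: Hash + Eq> UsageTable<K> {
    fn new() -> Self {
        Self { next_id: 0, usages: HashMap::new() }
    }

    // Records one usage of `key` and returns its stable id.
    fn record(&mut self, key: K) -> SelectionId {
        let next_id = &mut self.next_id;
        let entry = self.usages.entry(key).or_insert_with(|| {
            // First time this key is seen: assign the next sequential id.
            let id = SelectionId(*next_id);
            *next_id += 1;
            SelectionCount { id, count: 0 }
        });
        entry.count += 1;
        entry.id
    }

    // Ids of all keys used at least twice, in ascending id order.
    fn repeated(&self) -> Vec<SelectionId> {
        let mut ids: Vec<SelectionId> = self
            .usages
            .values()
            .filter(|usage| usage.count >= 2)
            .map(|usage| usage.id)
            .collect();
        ids.sort();
        ids
    }
}
```

Generated fragments can then be stored in a map keyed by `SelectionId`, which is unique by construction.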

Addressed

dariuszkuc · 2025-02-03T21:21:03Z

@Mergifyio backport 1.x

mergify · 2025-02-03T21:21:07Z

backport 1.x

✅ Backports have been created

#6722 improve generate fragment compression (backport #6651) has been created for branch 1.x but encountered conflicts

Update `generate_fragment` algorithm to extract named fragments from common sub selections in the given operation. Fragments are now auto generated for each sub selection that occurs at least twice within the final fetch operation.

Algorithm consists of two phases
- recursively iterate over the selection sets and collect sub selection usages
- recursively iterate over the selection sets to find which sub selections were used more than once
  - use this sub selection to create new fragment definition and replace current selection set with just a fragment spread

Based on preliminary testing over 2.8 million ops this new algorithm
* generates better results in 95% of test operations (vs 77% of old one)
* p99 overhead of running this new compression algorithm results in just 10ms overhead
* in case of better results, at p99 it shrinks the queries by 50Kb
* in case it generates worse results, at p99 it only adds an additional 108 bytes to a query

(cherry picked from commit 273ee25)

# Conflicts:
# apollo-router/src/plugins/connectors/tests/mod.rs

dariuszkuc changed the title ~~improve generate fragment compression~~ [DRAFT] improve generate fragment compression Jan 27, 2025

briannafugate408 reviewed Jan 27, 2025

apollo-router/src/configuration/mod.rs (outdated)

TylerBloom reviewed Jan 27, 2025

dariuszkuc force-pushed the gen_fragment_v2 branch 2 times, most recently from 208f22d to ceac413 Compare January 28, 2025 21:49

dariuszkuc changed the title ~~[DRAFT] improve generate fragment compression~~ improve generate fragment compression Jan 29, 2025

dariuszkuc force-pushed the gen_fragment_v2 branch from 17abcd6 to 5a75c3d Compare January 29, 2025 06:47

dariuszkuc marked this pull request as ready for review January 29, 2025 06:47

dariuszkuc requested review from sachindshinde, SimonSapin, lrlna and duckki as code owners January 29, 2025 06:47

TylerBloom reviewed Jan 29, 2025

apollo-federation/src/operation/mod.rs

apollo-federation/src/operation/mod.rs

TylerBloom approved these changes Jan 29, 2025

dariuszkuc requested review from a team as code owners January 30, 2025 00:04

dariuszkuc force-pushed the gen_fragment_v2 branch from e011999 to d3a355a Compare January 30, 2025 02:59

dariuszkuc requested a review from a team as a code owner January 30, 2025 02:59

goto-bus-stop reviewed Jan 31, 2025

apollo-federation/src/operation/mod.rs (outdated)

goto-bus-stop reviewed Jan 31, 2025

apollo-federation/src/operation/selection_map.rs

goto-bus-stop reviewed Jan 31, 2025

apollo-federation/src/operation/mod.rs

goto-bus-stop previously requested changes Jan 31, 2025

apollo-federation/src/operation/optimize.rs (outdated)

dariuszkuc added 3 commits January 31, 2025 21:18

improve generate fragment compression

d7e242b

code review

b1b015e

fix: remove redundant parent type position information

c84df1a

dariuszkuc and others added 14 commits January 31, 2025 21:18

fix SelectionMap hash should be order independent

7f2db49

fix tests

06d7742

fix hashing of selection sets

ed75895

lint

421a5d3

drop old gen fragment algorithm

dbc30d7

missed file

3eac75a

update mocks with new gen fragment algorithm

4bfa93a

update fragment info in some tests

1f9b64c

Update subgraph mocks for new fragments

3e94cb9

Update fragments in entity cache invalidation test

21ed580

Update fragments in core defer test

a4d0799

Update fragments in benchmark suite

13b5986

Update fragments in entity cache + defer sample test

32167cf

dariuszkuc force-pushed the gen_fragment_v2 branch from 5d89f12 to e264f4f Compare February 1, 2025 03:19

dariuszkuc added 2 commits January 31, 2025 21:22

lint

8785d15

more lint...

8574659

goto-bus-stop self-requested a review February 3, 2025 09:01

dariuszkuc added the backport-1.x Backport this PR to 1.x label Feb 3, 2025

lrlna approved these changes Feb 3, 2025

TylerBloom approved these changes Feb 3, 2025

dariuszkuc merged commit 273ee25 into dev Feb 3, 2025
16 checks passed

dariuszkuc deleted the gen_fragment_v2 branch February 3, 2025 20:49

mergify bot mentioned this pull request Feb 3, 2025

improve generate fragment compression (backport #6651) #6722

Closed

6 tasks

dariuszkuc mentioned this pull request Feb 4, 2025

backport 1.x: improve generate fragment compression #6723

Merged

6 tasks
