Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spool] Introduce stage replacer and change send nodes to be able to send to more than one stage #14495

Merged
merged 12 commits into from
Nov 21, 2024

Conversation

gortiz
Copy link
Contributor

@gortiz gortiz commented Nov 19, 2024

This PR is a continuation of #14296 and the next step on #14196.

Here we define how to modify a query to replace stages that have been found equivalent. Not all equivalent stages can be substitute. Specifically, two stages that send to the same stage cannot be substituted by a spool. The reason for that is that the mailbox service assumes there the tuple (receiveStage, sendStage) is unique in the query. Probably this can be changed, but it doesn't seem to be worth it in this initial phase.

This PR also changes the equivalence function to consider not equivalent two senders with different distribution. The design algorithm includes some ways to also support this, but again, it adds a complexity that doesn't seem to be required in the first spool version.

Several tests have been added and also a new builder API to create spools in tests. Although the spool builder code is a bit ugly, it simplifies the tests that use it. There are some examples in EquivalentStagesReplacerTest.

The next PR may be the final one, where the worker protocol and operators are changed to support these new senders that can send everywhere.

@gortiz
Copy link
Contributor Author

gortiz commented Nov 19, 2024

cc @bziobrowski

@gortiz gortiz marked this pull request as ready for review November 19, 2024 15:43
@gortiz gortiz requested a review from ankitsultana November 19, 2024 15:45
@codecov-commenter
Copy link

codecov-commenter commented Nov 19, 2024

Codecov Report

Attention: Patch coverage is 48.05195% with 40 lines in your changes missing coverage. Please review.

Project coverage is 63.86%. Comparing base (59551e4) to head (3b252cf).
Report is 1371 commits behind head on master.

Files with missing lines Patch % Lines
.../pinot/query/planner/plannode/MailboxSendNode.java 38.70% 15 Missing and 4 partials ⚠️
.../pinot/query/planner/plannode/PlanNodeVisitor.java 33.33% 18 Missing ⚠️
.../query/planner/logical/EquivalentStagesFinder.java 50.00% 0 Missing and 1 partial ⚠️
...che/pinot/query/planner/logical/GroupedStages.java 0.00% 1 Missing ⚠️
...not/query/planner/plannode/MailboxReceiveNode.java 75.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14495      +/-   ##
============================================
+ Coverage     61.75%   63.86%   +2.11%     
- Complexity      207     1570    +1363     
============================================
  Files          2436     2673     +237     
  Lines        133233   146738   +13505     
  Branches      20636    22499    +1863     
============================================
+ Hits          82274    93710   +11436     
- Misses        44911    46104    +1193     
- Partials       6048     6924     +876     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.84% <48.05%> (+2.13%) ⬆️
java-21 63.73% <48.05%> (+2.11%) ⬆️
skip-bytebuffers-false 63.85% <48.05%> (+2.10%) ⬆️
skip-bytebuffers-true 63.71% <48.05%> (+35.99%) ⬆️
temurin 63.86% <48.05%> (+2.11%) ⬆️
unittests 63.85% <48.05%> (+2.11%) ⬆️
unittests1 55.57% <48.05%> (+8.68%) ⬆️
unittests2 34.56% <0.00%> (+6.83%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

Copy link
Collaborator

@yashmayya yashmayya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed the non-testing changes here and will review the tests in a subsequent pass. Thanks for the patch!

@@ -77,6 +78,8 @@ public abstract SortedSet<MailboxSendNode> getGroup(MailboxSendNode stage)
*/
public abstract SortedSet<SortedSet<MailboxSendNode>> getGroups();

public abstract Set<MailboxSendNode> getStages();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small doc comment might be useful here to distinguish this method from getLeaders / getGroups.

Comment on lines +125 to +127
* The returned value of this method is ignored by default
*/
protected T preChildren(PlanNode node, C context) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make this a void function then? Doesn't seem like there are any future planned use cases for the return value here either?

@gortiz gortiz mentioned this pull request Nov 20, 2024
2 tasks
Copy link
Collaborator

@yashmayya yashmayya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than a few minor comments.

}

public SimpleChildBuilder<MailboxReceiveNode> newReceiver(
Function<SimpleChildBuilder<MailboxReceiveNode>, SimpleChildBuilder<MailboxReceiveNode>> customize) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case for this customization function?

@gortiz gortiz merged commit cf720e3 into apache:master Nov 21, 2024
21 checks passed
@gortiz gortiz deleted the spool2 branch November 21, 2024 16:34
davecromberge pushed a commit to davecromberge/pinot that referenced this pull request Nov 22, 2024
@gortiz gortiz mentioned this pull request Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants