Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Greedy query planning for large connected components #1442

Merged
merged 27 commits into from
Oct 10, 2024

Conversation

joka921
Copy link
Member

@joka921 joka921 commented Aug 14, 2024

So far, the query planning took very long when the query graph has a large connected component. With this change, the query planner counts the number of connected subgraphs for each connected component. If that number exceeds a certain budget, the query planning for the component is done using so-called Greedy Operator Ordering (GOO). For a connected component with n nodes, GOO considers a quadratic number of plans, whereas the optimal dynamic programming potentially considers an exponential number of plans. The budget can be controlled via the runtime parameter query-planning-budget (default: 1500).

It is currently enabled by an explicit Runtime parameter.
Copy link

sonarcloud bot commented Aug 14, 2024

Copy link

codecov bot commented Aug 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.33%. Comparing base (fb7b50b) to head (20a5538).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1442      +/-   ##
==========================================
+ Coverage   88.27%   88.33%   +0.05%     
==========================================
  Files         361      362       +1     
  Lines       27198    27319     +121     
  Branches     3663     3682      +19     
==========================================
+ Hits        24009    24131     +122     
+ Misses       1954     1952       -2     
- Partials     1235     1236       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@hannahbast hannahbast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a first look with Johannes. Awesome that this was possible with so little code. Speaks for the quality of the question planner code!

# Conflicts:
#	src/global/RuntimeParameters.h
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
# Conflicts:
#	src/engine/QueryPlanner.cpp
…P algorithm is very very expensive.

Signed-off-by: Johannes Kalmbach <[email protected]>
# Conflicts:
#	test/QueryPlannerTest.cpp
#	test/QueryPlannerTestHelpers.h
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
@hannahbast hannahbast changed the title Greedy query planning. Greedy query planning for "large" queries Oct 9, 2024
Signed-off-by: Johannes Kalmbach <[email protected]>
Copy link
Member

@hannahbast hannahbast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thorough 1-1 with Johannes, during which I made various changes that I will commit next

@hannahbast hannahbast changed the title Greedy query planning for "large" queries Greedy query planning for large connected components Oct 9, 2024
@hannahbast hannahbast changed the title Greedy query planning for large connected components Make query planning work also for large connected components Oct 9, 2024
@hannahbast hannahbast changed the title Make query planning work also for large connected components Switch to greedy query planning for large connected components Oct 9, 2024
@hannahbast
Copy link
Member

hannahbast commented Oct 9, 2024

@joka921 Thanks a lot + looks great now. Four more nitpicks:

  1. expectDifferentPlanners looks like a misnomer (look at its many applications in test/QueryPlannerTest.cpp). Can you find a better name?
  2. How about calling the runtime parameter query-planning-budget (it's not really the budget for greedy query planning, as the current name suggests, but rather a threshold beyond which we switch to greedy query planning). Alternatives would be query-planning-threshold (which sounds a bit generic) or greedy-query-planning-threshold (which is a bit long).
  3. Shouldn't we drop the use-greedy-planning parameter now? We can always simulate it by setting the budget to zero or to a very large number
  4. Is there a test for the if (graph.size() > 64) { return budget + 1; } case? The comment currently says that we are not sure whether this condition can ever be true.

NOTE: The description of the PR above mentions the runtime parameter. Please make sure that the names are consistent.

Signed-off-by: Johannes Kalmbach <[email protected]>
Signed-off-by: Johannes Kalmbach <[email protected]>
@joka921
Copy link
Member Author

joka921 commented Oct 10, 2024

@hannahbast

  1. The default function (check with both planners) is now again called only expect (reduce the change), The other ones have explicit names that state, that you only want greedy or dynamic programming respectively.
  2. I went with query-planning-budget which seems to be consistent with the current commit message.
  3. Agreed and dropped
  4. I replaced this if by an assertion that explicitly tells us where we have to change things should we ever lift the restriction of 64 elements.

The other suggestions are from Codecov etc.

@hannahbast hannahbast changed the title Switch to greedy query planning for large connected components Greedy query planning for large connected components Oct 10, 2024
Copy link
Member

@hannahbast hannahbast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now ready to merge (assuming the tests run through)

@hannahbast hannahbast merged commit b97c44c into ad-freiburg:master Oct 10, 2024
19 checks passed
Copy link

sonarcloud bot commented Oct 10, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants