Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

the query result of intersect can be aggregated before intersect #59156

Open
windtalker opened this issue Jan 23, 2025 · 2 comments
Open

the query result of intersect can be aggregated before intersect #59156

windtalker opened this issue Jan 23, 2025 · 2 comments
Assignees
Labels
sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.

Comments

@windtalker
Copy link
Contributor

windtalker commented Jan 23, 2025

Enhancement

consider the case that

select id, value from a intersect select id, value from b

From mysql's doc(https://dev.mysql.com/doc/refman/9.0/en/intersect.html), if intersect means intersect distinct by default, so the above sql is the same as

select id, value from a group by id, value intersect select id, value from b group by id, value

if there is a lots of duplicate entries in table a and table b, this aggregation can help to improve the query performance a lot, and also save the memory usage a lot.

Even if user's query is

select id, value from a intersect all select id, value from b

The query can still be rewriten as

select id, value from a intersect all select id, value from b group by id, value
@windtalker windtalker added the type/enhancement The issue or PR belongs to an enhancement. label Jan 23, 2025
@hawkingrei hawkingrei self-assigned this Jan 23, 2025
@winoros winoros added the sig/planner SIG: Planner label Jan 24, 2025
@winoros
Copy link
Member

winoros commented Jan 24, 2025

This is just part of the semi-join to inner-join rewrite.
It is trivial, but it's cost based. (The rewrite is bad when the source is something like 1m rows with 0.9 million NDV)
So it's not done yet.

@winoros
Copy link
Member

winoros commented Jan 24, 2025

And another thing we need to notify is that the intersect is generating the join whose join key is nulleq not eq.
It also needs to be specially handled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

3 participants