Optimize Iceberg table count #46525
Comments
We have marked this issue as stale because it has been inactive for 6 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to StarRocks!
Still relevant.
@Samrose-Ahmed maybe you can try PR #43616; it's merged into 3.2.
That PR does not rely on Iceberg metadata. However, it can speed things up a lot, and it's a general optimization (it works for any table format).
Thanks much! That's a good PR in general. I just tried it and it does improve things, but count(*) could be much faster for Iceberg (almost instant) because we have the metadata.
Enhancement
One can obtain the count(*) for an Iceberg table from the Iceberg metadata without having to do a full scan of the data. Currently, StarRocks performs a full scan of the Iceberg table data when running a count(*) query on an external Iceberg lake table. This should be optimized to use only the Iceberg metadata (which is already available via the statistics). E.g., the cardinality in the IcebergScanNode already has the result, so no scan needs to be performed.
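To make the request concrete, here is a minimal sketch of how an unfiltered count(*) could be answered from Iceberg metadata alone. It is not StarRocks code. The function name `count_from_snapshot_summary` is hypothetical, and the sketch assumes the current snapshot's summary map carries the optional `total-records`, `total-position-deletes`, and `total-equality-deletes` entries described in the Iceberg table spec. When row-level deletes are present, or the entries are missing, it conservatively signals that a scan is still needed.

```python
from typing import Mapping, Optional


def count_from_snapshot_summary(summary: Mapping[str, str]) -> Optional[int]:
    """Return the row count from a snapshot summary, or None if a scan is needed.

    `summary` is assumed to be the string-to-string summary map of the
    table's current snapshot (hypothetical input for this sketch).
    """
    total = summary.get("total-records")
    if total is None:
        # Summary entries are optional, so the writer may not have set them.
        return None

    # With row-level deletes, total-records can overcount live rows,
    # so fall back to a real scan instead of guessing.
    position_deletes = int(summary.get("total-position-deletes", "0"))
    equality_deletes = int(summary.get("total-equality-deletes", "0"))
    if position_deletes > 0 or equality_deletes > 0:
        return None

    return int(total)


if __name__ == "__main__":
    summary = {"total-records": "1200", "total-position-deletes": "0"}
    print(count_from_snapshot_summary(summary))
```

This shortcut only applies to a count(*) with no WHERE clause; a filtered count would still need per-file statistics or a scan.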