Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support vectorized aggregation on Hypercore TAM #7655

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

erimatnor
Copy link
Contributor

@erimatnor erimatnor commented Feb 4, 2025

Add support for vectorized aggregation over Hypercore TAM. This includes some refactoring of the VectorAgg node in order to plan vectorized aggregation on top of ColumnarScans.

Currently, only ColumnarScan can run below VectorAgg, because it is doing qual filtering. In theory, a SeqScan reading from Hypercore TAM should also work because it would produce Arrow slots. However, a SeqScan doesn't do vectorized filtering, which is currently assumed to be done before the VectorAgg node.

In ColumnarScan, it necessary to turn off projection when VectorAgg is used. Otherwise, it would project the arrow slot into a virtual slot, thus losing the vector data. Ideally, a projection should never be planned to begin with, but this isn't possible since VectorAgg relies on replacing existing non-vectorized Agg plans added by PostgreSQL.

The existing vectoragg tests can be run with TAM enabled by default and in that case the tests produces the same vectoragg results as without TAM.

Closes: #7654

Disable-check: commit-count

Copy link

codecov bot commented Feb 4, 2025

Codecov Report

Attention: Patch coverage is 91.26214% with 9 lines in your changes missing coverage. Please review.

Project coverage is 81.33%. Comparing base (59f50f2) to head (20de575).
Report is 734 commits behind head on main.

Files with missing lines Patch % Lines
tsl/src/nodes/vector_agg/exec.c 86.48% 0 Missing and 5 partials ⚠️
tsl/src/nodes/vector_agg/plan.c 93.02% 0 Missing and 3 partials ⚠️
tsl/src/nodes/vector_agg/vector_slot.h 95.45% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7655      +/-   ##
==========================================
+ Coverage   80.06%   81.33%   +1.27%     
==========================================
  Files         190      242      +52     
  Lines       37181    44902    +7721     
  Branches     9450    11198    +1748     
==========================================
+ Hits        29770    36523    +6753     
- Misses       2997     3991     +994     
+ Partials     4414     4388      -26     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@erimatnor erimatnor force-pushed the hypercore-vectoragg branch from 486e2b1 to 270fd54 Compare February 4, 2025 15:20
@erimatnor erimatnor force-pushed the hypercore-vectoragg branch from 270fd54 to 232df0a Compare February 6, 2025 09:11
@erimatnor erimatnor force-pushed the hypercore-vectoragg branch 7 times, most recently from fabdb44 to efd1d0e Compare February 6, 2025 13:05
Add support for vectorized aggregation over Hypercore TAM. This
includes some refactoring of the VectorAgg node in order to plan
vectorized aggregation on top of ColumnarScans.

Currently, only ColumnarScan can run below VectorAgg, because it is
doing qual filtering. In theory, a SeqScan reading from Hypercore TAM
should also work because it would produce Arrow slots. However, a
SeqScan doesn't do vectorized filtering, which is currently assumed to
be done before the VectorAgg node.

In ColumnarScan, it necessary to turn off projection when VectorAgg is
used. Otherwise, it would project the arrow slot into a virtual slot,
thus losing the vector data. Ideally, a projection should never be
planned to begin with, but this isn't possible since VectorAgg relies
on replacing existing non-vectorized Agg plans added by PostgreSQL.
@erimatnor erimatnor force-pushed the hypercore-vectoragg branch from efd1d0e to 20de575 Compare February 6, 2025 13:47
@erimatnor erimatnor marked this pull request as ready for review February 6, 2025 16:53
@erimatnor erimatnor requested a review from akuzm February 6, 2025 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Enhancement]: Support vectorized aggregation with Hypercore TAM
2 participants