refactor(rust): Refactor compute kernels in polars-arrow to avoid using gather #19669

Draft · wants to merge 10 commits into base: main

nameexhaustion (Collaborator) commented Nov 6, 2024

Before we can move the gather logic to polars-compute we need to remove all uses of it in polars-arrow, as it will no longer be accessible in polars-arrow after the move.

github-actions bot added the labels internal (An internal refactor or improvement) and rust (Related to Rust Polars) on Nov 6, 2024
}
let take_values = unsafe {
crate::compute::take::take_unchecked(list.values().as_ref(), &indices.freeze())
nameexhaustion (Collaborator, Author) commented Nov 6, 2024

Casting a nullable List to FixedSizeList used a gather to ensure that the null slots have the correct width. I have updated this to use Growable instead.
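
A minimal sketch of the difference, assuming a simplified model built on plain `Vec`s rather than the actual polars-arrow types. `ListArr`, `cast_via_gather` and `cast_via_extend` are hypothetical names for illustration only, and the sketch assumes every valid row already has exactly `width` elements:

```rust
/// Simplified stand-in for a nullable list array of i64
/// (hypothetical, not the polars-arrow representation).
struct ListArr {
    offsets: Vec<usize>,      // rows + 1 entries; row i spans offsets[i]..offsets[i + 1]
    validity: Vec<bool>,      // per-row validity; false means the row is null
    values: Vec<Option<i64>>, // flat child values
}

/// Gather-style: materialize one index per output slot, then take.
fn cast_via_gather(list: &ListArr, width: usize) -> Vec<Option<i64>> {
    let mut indices: Vec<Option<usize>> = Vec::with_capacity(list.validity.len() * width);
    for (row, &valid) in list.validity.iter().enumerate() {
        let start = list.offsets[row];
        if valid {
            indices.extend((start..start + width).map(Some));
        } else {
            // A null row may be empty in the source, but it must still
            // occupy `width` slots in the fixed-size output.
            indices.extend(std::iter::repeat(None).take(width));
        }
    }
    indices
        .into_iter()
        .map(|i| i.and_then(|i| list.values[i]))
        .collect()
}

/// Growable-style: copy whole ranges or append nulls, with no index buffer.
fn cast_via_extend(list: &ListArr, width: usize) -> Vec<Option<i64>> {
    let mut out = Vec::with_capacity(list.validity.len() * width);
    for (row, &valid) in list.validity.iter().enumerate() {
        let start = list.offsets[row];
        if valid {
            out.extend_from_slice(&list.values[start..start + width]);
        } else {
            out.extend(std::iter::repeat(None).take(width));
        }
    }
    out
}
```

Both functions produce the same flat values; the second simply skips the intermediate index buffer.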

}

let values = arr.values();
// SAFETY:
// the indices we generate are in bounds
unsafe { Ok(take_unchecked(&**values, &take_by)) }
nameexhaustion (Collaborator, Author) commented

list.get() / array.get() were building selection indices and then calling gather with them. I've rewritten them to use loops instead.
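
As a rough illustration of the loop-based approach (a sketch over plain slices, not the code in this PR; `resolve` and `get_loop` are hypothetical names), each row resolves its index and reads the value directly instead of first collecting indices for a gather:

```rust
/// Resolve a possibly negative index against a row of length `len`.
/// Returns None when the index is out of bounds.
fn resolve(idx: i64, len: usize) -> Option<usize> {
    let len = len as i64;
    let i = if idx < 0 { idx + len } else { idx };
    (0..len).contains(&i).then(|| i as usize)
}

/// Loop-based `get`: read one value per row directly from the flat values.
/// `offsets` has rows + 1 entries; row r spans offsets[r]..offsets[r + 1].
fn get_loop(
    offsets: &[usize],
    values: &[Option<i64>],
    idx: i64,
    null_on_oob: bool,
) -> Result<Vec<Option<i64>>, String> {
    let mut out = Vec::with_capacity(offsets.len().saturating_sub(1));
    for w in offsets.windows(2) {
        let (start, end) = (w[0], w[1]);
        match resolve(idx, end - start) {
            Some(i) => out.push(values[start + i]),
            None if null_on_oob => out.push(None),
            // Include the offending index in the message.
            None => return Err(format!("get index {idx} is out of bounds")),
        }
    }
    Ok(out)
}
```

With `null_on_oob` set to false, the first row for which the index does not resolve aborts the whole operation with an error.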

out = s.arr.get(100, null_on_oob=False)

with pytest.raises(ComputeError, match="get index -3 is out of bounds"):
nameexhaustion (Collaborator, Author) commented

Drive-by: print the out-of-bounds index in the error message.

codecov bot commented Nov 7, 2024

Codecov Report

Attention: Patch coverage is 91.92547% with 13 lines in your changes missing coverage. Please review.

Project coverage is 79.73%. Comparing base (8335f75) to head (5dff00a).
Report is 1 commits behind head on main.

Files with missing lines | Patch % | Lines
...polars-arrow/src/legacy/kernels/fixed_size_list.rs | 87.37% | 13 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #19669   +/-   ##
=======================================
  Coverage   79.72%   79.73%           
=======================================
  Files        1542     1542           
  Lines      212208   212232   +24     
  Branches     2449     2449           
=======================================
+ Hits       169182   169220   +38     
+ Misses      42472    42458   -14     
  Partials      554      554           


ritchie46 (Member) commented

There might be some performance implications in those rewrites. It's hard to tell. I think we actually should move the cast to polars-compute as well. Then the dependency problem is resolved.

nameexhaustion (Collaborator, Author) commented

> There might be some performance implications in those rewrites. It's hard to tell. I think we actually should move the cast to polars-compute as well. Then the dependency problem is resolved.

I tried to move the casting code, but I don't think it's possible, as the cast is currently used by ArrowArray::new() in the polars-arrow crate:

let variadic_buffer_sizes = if needs_variadic_buffer_sizes {
#[cfg(feature = "compute_cast")]
{
let arr = crate::compute::cast::cast_unchecked(

From benchmarking, the PR as it stands improves list->array casting performance but regresses list.get() / array.get() performance:

# DF
shape: (20_000_000, 2)
┌──────────┬─────────────────────────────────┐
│ i64      ┆ list                            │
│ ---      ┆ ---                             │
│ i64      ┆ list[i64]                       │
╞══════════╪═════════════════════════════════╡
│ 6765403  ┆ [6765403, 6765403, … 6765403]   │
│ 16059030 ┆ [16059030, 16059030, … 1605903… │
...
# This PR
cast list->array 0.7143477080389857
list.get(i64) 0.2533301250077784
arr.get(col(indices)) 0.860774208791554
# 1.12.0
cast list->array 1.0722814579494298
list.get(i64) 0.2003235830925405
arr.get(col(indices)) 0.5676086670719087
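
To put the timings above side by side, here is a small sketch that computes the ratio of this PR's timing to the 1.12.0 timing for each benchmark. It assumes both runs report the same unit and that lower is better; the unit is not stated in the output. `ratio` and `report` are hypothetical helper names:

```rust
/// Ratio of this PR's timing to the 1.12.0 timing.
/// Below 1.0 means the PR is faster; above 1.0 means it is slower.
fn ratio(pr: f64, baseline: f64) -> f64 {
    pr / baseline
}

/// Benchmark names paired with their PR / 1.12.0 ratios,
/// using the numbers reported above.
fn report() -> Vec<(&'static str, f64)> {
    vec![
        ("cast list->array", ratio(0.7143477080389857, 1.0722814579494298)),
        ("list.get(i64)", ratio(0.2533301250077784, 0.2003235830925405)),
        ("arr.get(col(indices))", ratio(0.860774208791554, 0.5676086670719087)),
    ]
}
```

The ratios work out to roughly 0.67 for the cast, 1.26 for list.get(i64) and 1.52 for arr.get(col(indices)).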

I think that for list.get() / array.get(), switching to the growable introduced dynamic dispatch for every row, which is the cause of the performance regression. cast() performance, on the other hand, improved because it already used dynamic dispatch and we removed the extra step of materializing selection indices.
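
To illustrate what per-row dynamic dispatch means here, a minimal sketch under simplifying assumptions: `Sink` is a hypothetical trait standing in for a growable-like interface, not the polars-arrow Growable API. The first function makes a virtual call for every row, while the second knows the concrete type and can be compiled to a tight loop:

```rust
/// Hypothetical growable-like interface used through a trait object.
trait Sink {
    fn push_from(&mut self, src_index: usize);
    fn push_null(&mut self);
}

/// Concrete sink that copies i64 values out of a source slice.
struct I64Sink<'a> {
    src: &'a [i64],
    out: Vec<Option<i64>>,
}

impl Sink for I64Sink<'_> {
    fn push_from(&mut self, src_index: usize) {
        self.out.push(Some(self.src[src_index]));
    }
    fn push_null(&mut self) {
        self.out.push(None);
    }
}

/// Dynamic dispatch: one virtual call per row through `&mut dyn Sink`.
fn get_dyn(sink: &mut dyn Sink, picks: &[Option<usize>]) {
    for pick in picks {
        match pick {
            Some(i) => sink.push_from(*i),
            None => sink.push_null(),
        }
    }
}

/// Static dispatch: the element type is known, so there is no
/// virtual call inside the loop.
fn get_typed(src: &[i64], picks: &[Option<usize>]) -> Vec<Option<i64>> {
    picks.iter().map(|pick| pick.map(|i| src[i])).collect()
}
```

Both produce the same output. Whether the difference is measurable depends on how much work each call does, so this is only meant to show the shape of the overhead, not to quantify it.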

Maybe I can leave the cast in polars-arrow for now and spend some time resolving the list.get() performance regression before continuing with moving gather() to polars-compute?
