feat[python]: Add masked array support to numpy interop API #17577
base: main
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #17577 +/- ##
==========================================
- Coverage 81.00% 80.45% -0.55%
==========================================
Files 1448 1484 +36
Lines 190551 195362 +4811
Branches 2723 2778 +55
==========================================
+ Hits 154361 157185 +2824
- Misses 35687 37665 +1978
- Partials 503 512 +9
The main thing you want to test is that integer and boolean series with nulls are converted to masked arrays in a dtype-preserving way.
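To make "dtype-preserving" concrete, here is a minimal NumPy-only sketch. It does not use the `masked` argument from this PR. The names `raw`, `as_float`, and `as_masked` are illustrative, and the NaN-based float array stands in for the usual fallback when an integer column contains nulls.

```python
import numpy as np

# Illustrative input: an integer column with one null (None).
raw = [1, None, 3, 4]

# Fallback representation: nulls become NaN, which forces a float dtype.
as_float = np.array([np.nan if v is None else v for v in raw], dtype=np.float64)

# Masked representation: the data keeps its integer dtype and the nulls
# are tracked in a separate boolean mask (True means "masked out").
mask = np.array([v is None for v in raw])
data = np.array([0 if v is None else v for v in raw], dtype=np.uint8)
as_masked = np.ma.masked_array(data, mask=mask)

print(as_float.dtype)   # float dtype, the original integer width is lost
print(as_masked.dtype)  # integer dtype is kept
```

A test along these lines would assert on both the dtype and the mask of the result, rather than on the dtype alone.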
@s-banach Sure, I'm looking at adding the tests for that. Would you be able to explain the current numeric conversion logic? The boolean conversion seems rather straightforward (if there's an optional it converts to
def test_optional_int_array_to_masked() -> None:
    values = [1, 2, 3, 4]
    s = pl.Series('a', values, pl.UInt8)
    result = s.to_numpy()
    print(result.dtype)
    assert result.dtype == int
WIP - will update once numeric conversion semantics are clear to me
What
Aims to close #16398. This adds a `masked` argument to the `PySeries` API, which changes the output to create a numpy masked array directly, using the underlying validity buffers of the polars series.
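As a rough illustration of the idea (not the code in this PR), the sketch below turns an Arrow-style validity bitmap into a NumPy mask. It assumes the Arrow convention that bits are packed least-significant-bit first and that a set bit means "valid". The helper `mask_from_validity` and the sample buffers are hypothetical.

```python
import numpy as np


def mask_from_validity(bitmap: bytes, length: int) -> np.ndarray:
    """Hypothetical helper: convert a bit-packed validity bitmap to a NumPy mask.

    Assumes LSB-first bit order and 1 = valid. NumPy masked arrays use the
    opposite convention (True = masked), so the bits are inverted.
    """
    packed = np.frombuffer(bitmap, dtype=np.uint8)
    bits = np.unpackbits(packed, bitorder="little", count=length)
    return bits == 0


# Sample data: four uint8 values where slot 1 is null.
values = np.array([1, 0, 3, 4], dtype=np.uint8)
validity = bytes([0b00001101])  # slots 0, 2 and 3 valid; slot 1 null

masked = np.ma.masked_array(values, mask=mask_from_validity(validity, len(values)))
print(masked)
```

Because the mask is derived from the validity bits and the values array is used as-is, the result keeps the `uint8` dtype instead of being promoted to float.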