-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add binary as numerical #19542
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for following up on this. I've left some comments.
impl_cast!(f32); | ||
impl_cast!(f64); | ||
|
||
impl Cast for f16 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't support f16
, so we can drop this.
@@ -93,7 +140,8 @@ pub fn binary_to_utf8<O: Offset>( | |||
} | |||
|
|||
/// Casts a [`BinaryArray`] to a [`PrimitiveArray`], making any uncastable value a Null. | |||
pub(super) fn binary_to_primitive<O: Offset, T>( | |||
#[allow(dead_code)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this isn't casting anymore, and we shouldn't dispatch to this via cast I think we should move this all much higher int he polars dependency chain.
This should instead be implemented in polars-ops
. Only the expressions should call into this and lower it is isn't something we should compile.
@@ -394,16 +403,26 @@ pub fn cast( | |||
match to_type { | |||
BinaryView => Ok(arr.to_binview().boxed()), | |||
LargeUtf8 => Ok(binview_to::utf8view_to_utf8::<i64>(arr).boxed()), | |||
UInt8 => binview_to_primitive_dyn::<u8>(&arr.to_binview(), to_type, options), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then we can remove all this from the cast kernels as well.
Related to #18991
Implements
from_buffer
method that casts from Binary to Numerical types. It is not the most efficient (i.e. size + reinterpret cast would probably better), but provides a consistent code for both LE and BE encodings. Name is inspired by numpy equivalent, but with polars style_
separator between words.I also added match arm in
cast
that goes fromBinary
-> numerical type. It assumes LE encoding, as it is a vastly more popular from the two. Happy to remove that part of code, since it does make an unspecified assumption. Lastly, I noticed that there is missing Numerical ->Binary
functionality. Out-of-scope for this MR IMO, but something I can do later on.