feat: Expressify str.strip_prefix & suffix #11197
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -334,6 +334,46 @@ pub trait Utf8NameSpaceImpl: AsUtf8 { | |
Ok(builder.finish()) | ||
} | ||
|
||
fn strip_prefix(&self, prefix: &Utf8Chunked) -> Utf8Chunked { | ||
let ca = self.as_utf8(); | ||
match prefix.len() { | ||
1 => match prefix.get(0) { | ||
Some(prefix) => { | ||
ca.apply_generic(|opt_s| opt_s.map(|s| s.strip_prefix(prefix).unwrap_or(s))) | ||
}, | ||
_ => Utf8Chunked::full_null(ca.name(), ca.len()), | ||
}, | ||
_ => binary_elementwise( | ||
ca, | ||
prefix, | ||
|opt_s: Option<&str>, opt_prefix: Option<&str>| match (opt_s, opt_prefix) { | ||
(Some(s), Some(prefix)) => Some(s.strip_prefix(prefix).unwrap_or(s)), | ||
_ => None, | ||
}, | ||
), | ||
This is the compiler error I get:

    error: lifetime may not live long enough
       --> /Users/reswqa/code/rust/polars/crates/polars-ops/src/chunked_array/strings/namespace.rs:350:48
        |
    349 |                 |opt_s: Option<&str>, opt_prefix: Option<&str>| match (opt_s, opt_prefix) {
        |                                -                               - return type of closure is std::option::Option<&'2 str>
        |                                |
        |                                let's call the lifetime of this reference `'1`
    350 |                     (Some(s), Some(prefix)) => Some(s.strip_prefix(prefix).unwrap_or(s)),
        |                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ returning this value requires that `'1` must outlive `'2`

Can you make this closure an actual function and annotate the lifetimes?

Do you mean something like this?

    fn strip_prefix_op<'a>(opt_s: Option<&'a str>, opt_prefix: Option<&'a str>) -> Option<&'a str> {
        match (opt_s, opt_prefix) {
            (Some(s), Some(prefix)) => Some(s.strip_prefix(prefix).unwrap_or(s)),
            _ => None,
        }
    }

and then rewrite the call to `binary_elementwise(ca, prefix, strip_prefix_op)`?

Pretty much.

But this gives me:

    error[E0308]: mismatched types
       --> /Users/reswqa/code/rust/polars/crates/polars-ops/src/chunked_array/strings/namespace.rs:359:18
        |
    359 |               _ => binary_elementwise(
        |  __________________^
    360 | |                 ca,
    361 | |                 prefix,
    362 | |                 strip_prefix_op
    363 | |             ),
        | |_____________^ one type is more general than the other
        |
        = note: expected enum `std::option::Option<&'a str>`
                   found enum `std::option::Option<&str>`
    note: the lifetime requirement is introduced here
      --> /Users/reswqa/code/rust/polars/crates/polars-core/src/chunked_array/ops/arity.rs:20:75
       |
    20 |     F: for<'a> FnMut(Option<T::Physical<'a>>, Option<U::Physical<'a>>) -> Option<K>,

I will take a look tomorrow.

Thanks!
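For readers following the thread, here is a minimal, std-only sketch of why the bound quoted in the error rejects a borrowed return value. `zip_fixed_output` and `zip_borrowed_output` are hypothetical stand-ins written for illustration; they are not the Polars `binary_elementwise` and only mimic the shape of its `F` bound. In the first function `K` is chosen once, outside the `for<'a>` binder, so the output cannot borrow from the inputs. In the second, a single caller-chosen lifetime ties inputs and output together.

```rust
/// Hypothetical stand-in for `binary_elementwise`: `K` is fixed outside the
/// `for<'a>` binder, so the output type cannot mention `'a`.
fn zip_fixed_output<F, K>(lhs: &[Option<&str>], rhs: &[Option<&str>], mut f: F) -> Vec<Option<K>>
where
    F: for<'a> FnMut(Option<&'a str>, Option<&'a str>) -> Option<K>,
{
    lhs.iter().zip(rhs).map(|(l, r)| f(*l, *r)).collect()
}

/// Hypothetical variant: one lifetime `'s` is shared by inputs and output,
/// so the result may borrow from the input strings.
fn zip_borrowed_output<'s, F>(
    lhs: &[Option<&'s str>],
    rhs: &[Option<&'s str>],
    mut f: F,
) -> Vec<Option<&'s str>>
where
    F: FnMut(Option<&'s str>, Option<&'s str>) -> Option<&'s str>,
{
    lhs.iter().zip(rhs).map(|(l, r)| f(*l, *r)).collect()
}

/// The named function proposed in the thread, with explicit lifetimes.
fn strip_prefix_op<'a>(opt_s: Option<&'a str>, opt_prefix: Option<&'a str>) -> Option<&'a str> {
    match (opt_s, opt_prefix) {
        (Some(s), Some(prefix)) => Some(s.strip_prefix(prefix).unwrap_or(s)),
        _ => None,
    }
}
```

Under these assumptions, `zip_borrowed_output(&lhs, &rhs, strip_prefix_op)` type-checks, while passing `strip_prefix_op` to `zip_fixed_output` should be rejected for the same reason as in the thread. An owned output still fits the first signature, for example `zip_fixed_output(&lhs, &rhs, |s, p| strip_prefix_op(s, p).map(str::to_owned))`, at the cost of one allocation per value.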
||
} | ||
} | ||
|
||
fn strip_suffix(&self, suffix: &Utf8Chunked) -> Utf8Chunked { | ||
let ca = self.as_utf8(); | ||
match suffix.len() { | ||
1 => match suffix.get(0) { | ||
Some(suffix) => { | ||
ca.apply_generic(|opt_s| opt_s.map(|s| s.strip_suffix(suffix).unwrap_or(s))) | ||
}, | ||
_ => Utf8Chunked::full_null(ca.name(), ca.len()), | ||
}, | ||
_ => binary_elementwise( | ||
ca, | ||
suffix, | ||
|opt_s: Option<&str>, opt_suffix: Option<&str>| match (opt_s, opt_suffix) { | ||
(Some(s), Some(suffix)) => Some(s.strip_suffix(suffix).unwrap_or(s)), | ||
_ => None, | ||
}, | ||
), | ||
} | ||
} | ||
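The broadcasting rule shared by `strip_prefix` and `strip_suffix` in this diff can be modeled on plain slices. The sketch below is an illustration only: `strip_prefix_model` is a hypothetical name, it uses `&[Option<&str>]` rather than `Utf8Chunked`, and it assumes the general branch pairs values positionally (here `zip` simply stops at the shorter input, which may differ from what `binary_elementwise` does with mismatched lengths).

```rust
/// Hypothetical slice-based model of the broadcast logic in the diff.
fn strip_prefix_model<'a>(
    values: &[Option<&'a str>],
    prefix: &[Option<&str>],
) -> Vec<Option<&'a str>> {
    match prefix {
        // A single non-null prefix is applied to every value; null values stay null.
        [Some(p)] => values
            .iter()
            .map(|v| v.map(|s| s.strip_prefix(*p).unwrap_or(s)))
            .collect(),
        // A single null prefix yields an all-null result of the same length.
        [None] => vec![None; values.len()],
        // Otherwise values and prefixes are paired element by element.
        _ => values
            .iter()
            .zip(prefix)
            .map(|(v, p)| match (*v, *p) {
                (Some(s), Some(p)) => Some(s.strip_prefix(p).unwrap_or(s)),
                _ => None,
            })
            .collect(),
    }
}
```

A value that does not start with the prefix is returned unchanged, which is what `unwrap_or(s)` expresses in the diff. The suffix case is the same with `str::strip_suffix`.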
|
||
fn split(&self, by: &str) -> ListChunked { | ||
let ca = self.as_utf8(); | ||
let mut builder = ListUtf8ChunkedBuilder::new(ca.name(), ca.len(), ca.get_values_size()); | ||
|
To be honest, I feel uncomfortable with this approach (`Helper` exists only to express an HRTB on the output type of the `FnMut`). But I couldn't come up with a good solution to the lifetime issue here in a short time. There may be a very simple way, and perhaps I overestimated the problem 😞 Any input or suggestions would be helpful. :) The next comment has the specific error message.

We definitely shouldn't merge it like this with this helper.

Yes, I also oppose this approach. I only pushed it to make the problem concrete and to see whether there is a good way to handle it; in other words, throwing out a brick to attract jade. :)
I know there are ways to bypass this issue, but I would rather make `binary_elementwise` work here, especially with closures. If there is really no good solution, I will revert this part and switch to an approach that does not require modifying this function.
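The `Helper` from the PR is not shown on this page, so the following is only a guess at the general technique: a helper trait that gives the closure's output type a name per input lifetime, which lets a `for<'a>` bound cover the return type as well. `BinaryStrFn`, `zip_with_helper`, and `strip_suffix_op` are hypothetical names, and the slices stand in for `Utf8Chunked`.

```rust
/// Hypothetical helper trait: exposes the output type for each input lifetime `'a`.
trait BinaryStrFn<'a> {
    type Out;
    fn apply(&mut self, lhs: Option<&'a str>, rhs: Option<&'a str>) -> Option<Self::Out>;
}

/// Blanket impl: any suitable `FnMut` gets the helper trait for free.
impl<'a, F, O> BinaryStrFn<'a> for F
where
    F: FnMut(Option<&'a str>, Option<&'a str>) -> Option<O>,
{
    type Out = O;
    fn apply(&mut self, lhs: Option<&'a str>, rhs: Option<&'a str>) -> Option<O> {
        (*self)(lhs, rhs)
    }
}

/// The output type is now a projection that may depend on the lifetime.
fn zip_with_helper<'s, F>(
    lhs: &[Option<&'s str>],
    rhs: &[Option<&'s str>],
    mut f: F,
) -> Vec<Option<<F as BinaryStrFn<'s>>::Out>>
where
    F: for<'a> BinaryStrFn<'a>,
{
    lhs.iter().zip(rhs).map(|(l, r)| f.apply(*l, *r)).collect()
}

/// Named function with explicit lifetimes, mirroring the suffix case.
fn strip_suffix_op<'a>(opt_s: Option<&'a str>, opt_suffix: Option<&'a str>) -> Option<&'a str> {
    match (opt_s, opt_suffix) {
        (Some(s), Some(suffix)) => Some(s.strip_suffix(suffix).unwrap_or(s)),
        _ => None,
    }
}
```

With this shape, `zip_with_helper(&lhs, &rhs, strip_suffix_op)` accepts the named function. Closures are the weak spot: the compiler generally does not infer a closure's higher-ranked signature through a helper trait, which may be part of why the thread treats this as a stopgap rather than something to merge.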