feat: support passing index object directly into maybe_set_index #1319
thanks for your PR! this would break backwards-compatibility, so I don't think we can just rename to |
That makes sense, I wondered if this was part of the "stable API" or not. Would you advise to stick with the original name ( |
one idea could be to add an extra keyword argument |
Right, so we preserve the
I'll give that a try! |
aaa1780 to 90a048d
Hey @Riik thanks for the effort! I left a minor comment and a suggestion to simplify the logic a bit.
Additionally, could we handle the case in which `df` is actually a series? That's the use case I had in mind when I opened the related issue. I.e. if the native_object is a pandas-like series, then I would like to modify its index, and clearly that's not possible unless another Series/Index/iterable is passed.
Since that's the case, I think we should also raise an error if `native_object` is a series but `column_names` is provided.
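The argument check proposed above could look roughly like the sketch below. It is not the code from this PR: `resolve_index_args` is a hypothetical helper, the `is_series` flag stands in for whatever series detection the library actually uses, and the error messages are made up for illustration.

```python
from __future__ import annotations


def resolve_index_args(is_series: bool, column_names=None, index=None) -> str:
    """Hypothetical helper: report which argument should drive the new index."""
    if column_names is not None and index is not None:
        raise ValueError("Pass only one of `column_names` or `index`")
    if column_names is None and index is None:
        raise ValueError("Pass either `column_names` or `index`")
    # A series has no columns to promote, so only an explicit index makes sense.
    if is_series and column_names is not None:
        raise ValueError("Cannot set the index of a series from column names")
    return "index" if index is not None else "column_names"


# Series + explicit index is accepted; series + column names is rejected.
series_choice = resolve_index_args(is_series=True, index=[10, 20, 30])
try:
    resolve_index_args(is_series=True, column_names="a")
    series_with_columns_rejected = False
except ValueError:
    series_with_columns_rejected = True
```

Keeping the check in one place means the dataframe and series paths do not each need their own validation.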
Ah yes, I see the idea now! I think that should work with the current implementation, let me add a test for it |
@FBruzzesi I've added a test case and also support for setting the index on a Series. It didn't work out of the box since pd.Series (docs) doesn't have a Also, there is the extra case where passing in I feel like this function is getting a bit unwieldy, i.e. there are a lot of conditionals inside the function. Would it be an option to make a separate function |
That's great, thanks! It looks good, I just tagged Marco asking if
It still seems pretty reasonable in scope and size. I will take a deeper look tomorrow to see if there is a way to cut some corners :) |
@Riik I just merged with main branch and pushed a few changes to parse index and column_names into keys, so that we avoid branching out for dataframe/series in multiple places. |
Thank you! I think the function reads quite nicely now, with the Anything else that still needs to happen on this PR? |
thanks for updating - implementation looks good, just not sure about the argument name `df`
I did a search on https://github.com/search?q=%22maybe_set_index%28%22+language%3APython&type=code&p=1, and nobody would be affected by such a renaming. I think this is rare enough as a function that we can just change it - the only user I found is scikit-lego, who write
nw.maybe_set_index(result, ["fold", "part"])
(i.e. they pass the first argument positionally, not by name)
@FBruzzesi's suggested `obj`, as that's what we use in `maybe_get_index`
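To illustrate why renaming the first parameter is safe for positional callers but not for keyword callers, here is a small self-contained sketch. `set_index_old` and `set_index_new` are hypothetical stand-ins, not the library's functions; they only mimic the change of the first parameter name from `df` to `obj`.

```python
def set_index_old(df, column_names):
    """Stand-in with the original first-parameter name."""
    return (df, column_names)


def set_index_new(obj, column_names):
    """Stand-in with the first parameter renamed to `obj`."""
    return (obj, column_names)


# Positional callers (the style found in the search) see no difference.
positional_ok = (
    set_index_old("result", ["fold", "part"])
    == set_index_new("result", ["fold", "part"])
)

# A caller who used the old keyword name would break after the rename.
try:
    set_index_new(df="result", column_names=["fold", "part"])
    keyword_breaks = False
except TypeError:
    keyword_breaks = True
```

Only callers who wrote `df=...` explicitly would be affected, which is why the code search matters for this decision.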
narwhals/utils.py
Outdated
df: object for which maybe set the index (can be either a Narwhals `DataFrame`
or `Series`).
is it a bit odd for this to be called `df` when it can be a dataframe or series?
thanks @Riik and @FBruzzesi !
Thanks for the feedback and help, it was a fun PR to work on!
What type of PR is this? (check all applicable)
Related issues
`maybe_set_index` #1239
Checklist
If you have comments or can explain your changes, please do so below.
Pandas set_index docs
Pandas set_index test cases
I've renamed the `column_names: str | list[str]` parameter to `keys: str | Series | list[Series | str]`. I referenced the Pandas docs, which have `keys: label or array-like or list of labels/arrays`. Since this is a backwards-compatibility function, I figured it is good for it to be close to the Pandas spec, and the original param name `column` doesn't cover all cases anymore.
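For reference, the pandas behaviour this description leans on can be sketched as follows. The example uses plain pandas only (not Narwhals), and the frame and values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# A label: the named column becomes the index and is dropped from the columns.
by_label = df.set_index("a")

# An array-like (here a Series): its values become the index, columns are kept.
by_series = df.set_index(pd.Series([10, 20, 30]))

# A list mixing labels and arrays produces a two-level MultiIndex.
by_mixed = df.set_index(["a", pd.Series([10, 20, 30])])
```

The three calls correspond to the "label", "array-like" and "list of labels/arrays" cases in the pandas `keys` description.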