Expose options to Series' cut and qcut #1007
allow_duplicates: false,
left_close: false,
include_breaks: true
)
I don't know how to explain these two new options, left_close and include_breaks, so I left them out of the docs. But if you have something in mind, feel free to suggest :)
From the Python docs:
left_closed - Set the intervals to be left-closed instead of right-closed.
include_breaks - Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a Categorical to a Struct.
It's probably fine if we just forward their documentation. Here's a more in-depth explanation:
left_closed
The cuts, which are the main argument to this function, divide the real line into discrete bins. E.g. if we passed cuts [-1, 1], we'd break up the real line into 3 bins like this (call them A, B, and C):
   A       B       C
-------|-------|-------
      -1       1
This is a bit ambiguous though. We need to decide which bins the cuts themselves belong to. Is -1 in A or B?
By default, the bins are "right closed", meaning the bins contain their right cut but not their left. So -1 ∈ A and 1 ∈ B.
Setting the left_closed argument flips this. Now the bins contain their left cut but not their right. So -1 ∈ B and 1 ∈ C.
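To make the boundary rule concrete, here is a minimal sketch in plain Python. It is only an illustration of the semantics described above, not how Explorer or Polars implement it; the helper name `bin_index` and its signature are hypothetical.

```python
from bisect import bisect_left, bisect_right


def bin_index(value, cuts, left_closed=False):
    """Return the 0-based index of the bin that `value` falls in.

    Hypothetical helper for illustration only. `cuts` must be sorted
    in ascending order; n cuts produce n + 1 bins.
    """
    if left_closed:
        # Left-closed bins [cut_i, cut_{i+1}): a value equal to a cut
        # belongs to the bin on the cut's right.
        return bisect_right(cuts, value)
    # Right-closed bins (cut_i, cut_{i+1}]: a value equal to a cut
    # belongs to the bin on the cut's left.
    return bisect_left(cuts, value)


cuts = [-1, 1]
labels = ["A", "B", "C"]

# Default (right-closed): -1 lands in A and 1 lands in B.
right_closed = [labels[bin_index(v, cuts)] for v in (-1, 1)]

# Left-closed: -1 lands in B and 1 lands in C.
left_closed = [labels[bin_index(v, cuts, left_closed=True)] for v in (-1, 1)]
```

Values that are not equal to a cut land in the same bin either way; only the cut points themselves move.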
include_breaks
This argument changes the return type of the overall function. Instead of just mapping each value to the bin it's in, the function also returns metadata about that bin. So you get a series of structs back instead of a series of categories.
See their docs for an example of the structs returned.
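As a rough picture of the change in shape, here is a small plain-Python sketch. It assumes the default right-closed bins and uses dictionaries to stand in for structs; the function `cut`, the field names `break_point` and `category`, and the label format are all hypothetical and may not match what Explorer or Polars actually return.

```python
from bisect import bisect_left
import math


def cut(values, cuts, include_breaks=False):
    """Bin `values` into right-closed intervals delimited by sorted `cuts`.

    Hypothetical sketch: returns category labels, or one dict per value
    (standing in for a struct) when `include_breaks` is true.
    """
    edges = [-math.inf, *cuts, math.inf]
    result = []
    for value in values:
        i = bisect_left(cuts, value)  # index of the right-closed bin
        category = f"({edges[i]}, {edges[i + 1]}]"
        if include_breaks:
            # Attach the right endpoint of the bin alongside its label.
            result.append({"break_point": edges[i + 1], "category": category})
        else:
            result.append(category)
    return result


plain = cut([-1, 0, 2], [-1, 1])
with_breaks = cut([-1, 0, 2], [-1, 1], include_breaks=True)
```

Here `plain` is a flat list of labels, while `with_breaks` carries the right endpoint of each bin next to its label, which is why the output type has to change.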
oh, thank you @billylanchantin!! :D
I will add these short descriptions :)
Also add docs regarding the `:left_close` and `:include_breaks` options. Thanks @billylanchantin :D Reference: #1007 (comment)
…1009)
* Fix `cut/3` and `qcut/3` when `:include_breaks` is false
  Also add docs regarding the `:left_close` and `:include_breaks` options. Thanks @billylanchantin :D Reference: #1007 (comment)
* Change default of `cut/qcut` to not include breaks
Closes: #1006