(feat): Aggregation via group-by in sc.get
#2590
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## master #2590 +/- ##
==========================================
+ Coverage 74.62% 74.83% +0.20%
==========================================
Files 115 116 +1
Lines 12763 12891 +128
==========================================
+ Hits 9525 9647 +122
- Misses 3238 3244 +6
Ok @ivirshup I think we're ready to go here. Thanks for the guidance!
Please either revert 0aef147 or allow both as in #2771, or wait for scverse/anndata#1245
Also, as I asked before: why move away from dataclasses?
I don't think that switching away from data classes removed any meaningful functionality here, but having to use […]. I don't think that there being some internal data classes is important here, especially since it's not user visible and may change at any time anyways. I have a few ideas for ways to change the implementation to add more methods, none of which are compatible with […].
Is there some functionality the data class was adding that I'm missing?
Regarding data classes, as said in person: I tend to use them when
- they basically just mean I can delete the `__init__` function and maybe add a `= field(...)`
- instances of the class actually end up as dict keys, being compared or so
The former is almost the case here; at the time I wrote it, it was the case.
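To make the two criteria above concrete, here is a minimal sketch using only the standard library. `GroupKey` and `Accumulator` are hypothetical names invented for illustration; they are not classes from this PR or from scanpy.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GroupKey:
    """Hypothetical key for one group (illustration only, not from this PR)."""

    column: str
    value: str


@dataclass
class Accumulator:
    """Hypothetical per-group state; the dataclass generates __init__ for us."""

    count: int = 0
    # A mutable default needs a factory so instances do not share one list.
    values: list[float] = field(default_factory=list)

    def add(self, x: float) -> None:
        self.count += 1
        self.values.append(x)


# frozen=True together with the generated __eq__ makes GroupKey hashable,
# so instances can be compared and used as dict keys.
totals: dict[GroupKey, Accumulator] = {}
for label, x in [("a", 1.0), ("b", 2.0), ("a", 3.0)]:
    key = GroupKey("cell_type", label)
    totals.setdefault(key, Accumulator()).add(x)
```

Neither property is needed if the class is purely internal and never compared or hashed, which is the trade-off being discussed in this thread.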
Let’s get the consensus `axis` design into this!
I think it will be used at some point, but also happy to remove. I think parameterizing […]
OK, good to know! Then this PR is fine as far as I’m concerned.
Modified (for `scanpy`) version of scverse/anndata#564. Fixes scverse/anndata#556

Big points of change:
- `obs` and `var` group-by + `varm`, `obsm`, `layers` as options for data to aggregate
- `AnnData` object instead of `DataFrame`
- `scanpy`-style public API

TODO (by @ivirshup):

Necessary:
- `"nonzero"` variations, should this be `"nonzero_count"` so it's a little like `"nanmean"`

Optional, can do later:
- `fill_value` argument for those values
- `obsm`, `varm`
- `nan*` variations)