(fix): resolve data ordering to match axis for stacked violin plots #3196

Merged
merged 12 commits into from
Aug 6, 2024


ilan-gold
Contributor

  • Release notes not necessary because:

@ilan-gold
Contributor Author

Will finish after lunch....wowza

@ilan-gold ilan-gold added this to the 1.10.3 milestone Aug 5, 2024

codecov bot commented Aug 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.12%. Comparing base (bb66827) to head (f5ea470).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3196      +/-   ##
==========================================
- Coverage   76.61%   74.12%   -2.50%     
==========================================
  Files         109      109              
  Lines       12532    12497      -35     
==========================================
- Hits         9602     9263     -339     
- Misses       2930     3234     +304     
| Files | Coverage Δ |
|---|---|
| src/scanpy/plotting/_stacked_violin.py | 84.07% <100.00%> (-0.43%) ⬇️ |

... and 30 files with indirect coverage changes

@ilan-gold ilan-gold changed the title from "(fix): resolve axis ordering for stacked violin plots" to "(fix): resolve data ordering to match axis for stacked violin plots" Aug 5, 2024
@ilan-gold
Contributor Author

I'm not exactly sure how to test this. The axis itself is not misordered; the problem is that we were not informing the violin plot of that ordering. I am not sure whether there is a way to access the underlying data of a plot...

@flying-sheep
Member

I have some limited experience with that plot order stuff.

There’s some compat code for old pandas versions added in #2816 which shows some spots where we mess with ordering.

Generally

  • when user specifies an order, we use that
  • if not, we rely on the DataFrame order for plotting; we don’t store this implicit order explicitly

@ilan-gold
Contributor Author

@flying-sheep I think the code added in that PR is minimally relevant to what happened here.

> when user specifies an order, we use that

Right, so the issue here is that the category ordering was used for the axis labels, but we were not imposing it on the data itself when the violin plots render. The violins are added row by row, separately from the axis labels, so the two could end up in different orders.
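The mismatch described above can be sketched in plain Python. This is only an illustration of the pattern, not the code in `_stacked_violin.py`: the names `data`, `label_order`, and `draw_rows` are hypothetical, and a dict stands in for the DataFrame, with its insertion order playing the role of the implicit row order.

```python
# Toy stand-in for the plotting data: group -> per-gene values.
# Dict insertion order plays the role of the DataFrame's implicit row order.
data = {"B": [1.0, 10.0], "C": [2.0, 20.0], "A": [3.0, 30.0]}

# The category order that the axis labels are taken from (assumed for this sketch).
label_order = ["A", "B", "C"]


def draw_rows(data, order=None):
    """Return the values row by row, in the order they would be drawn.

    If `order` is given, it is imposed on the data; otherwise the
    implicit (insertion) order of `data` is used.
    """
    keys = list(data) if order is None else list(order)
    return [data[key] for key in keys]


# Bug pattern: labels follow label_order, rows follow the implicit data order.
mismatched = list(zip(label_order, draw_rows(data)))

# Fix pattern: the same order drives both the labels and the rows.
matched = list(zip(label_order, draw_rows(data, order=label_order)))

print(mismatched[0])  # label "A" paired with B's values
print(matched[0])     # label "A" paired with A's values
```

In the mismatched case nothing raises an error, which is why this kind of bug is easy to miss: every row is drawn, just under the wrong label.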

> if not, we rely on the DataFrame order for plotting; we don’t store this implicit order explicitly

In some sense the above also applies. If we want to add some sort of user-facing part of the API to allow for ordering, that is fine, but I think that should be separate as it would go into the next minor release and this is a fairly large bug.

I'm fine not testing this because I genuinely don't know how and I spent a few hours yesterday trying different things to no avail.

Member

@flying-sheep flying-sheep left a comment

I trust you that this helps 🤷

Review comment on src/scanpy/plotting/_stacked_violin.py (outdated, resolved)
@ilan-gold
Contributor Author

ilan-gold commented Aug 6, 2024

> I trust you that this helps 🤷

I guess we can add a test that cements this behavior, but it's tough to test that the swapped plot is the transpose of the original at the level of the underlying data. I tried applying the ordering to what changed in the PR you referenced, but unfortunately that didn't help. I don't know the provenance of that change, so I think this is fair. We are using the public API of seaborn in a way that fixes the underlying issue as one would expect.

@ilan-gold ilan-gold enabled auto-merge (squash) August 6, 2024 17:41
@ilan-gold ilan-gold merged commit c6766d7 into main Aug 6, 2024
13 of 14 checks passed
@ilan-gold ilan-gold deleted the ig/violin_plot_ordering branch August 6, 2024 17:59
meeseeksmachine pushed a commit to meeseeksmachine/scanpy that referenced this pull request Aug 6, 2024
flying-sheep pushed a commit that referenced this pull request Aug 7, 2024
…atch axis for stacked violin plots) (#3200)

Co-authored-by: Ilan Gold <[email protected]>
Successfully merging this pull request may close these issues.

Incorrect column label and color assign when using 'rank_genes_groups_stacked_violin' with 'swap_axes=True'
2 participants