Add summary plots to guide notebook for Ewing annotation #1008

Merged

allyhawkins

Purpose/implementation Section

Please link to the GitHub issue that this pull request addresses.

Continuing on #993

What is the goal of this pull request?

Here I'm filling in the section of the notebook that we will use to summarize the outputs from all the workflows: SingleR, clustering, AUCell, and expression of custom marker genes. The goal is to use these plots to help validate normal and tumor cell assignments. The next section will then focus on adjusting any assignments based on these results and pulling out tumor cells for subclustering and assignment of tumor cell states.

Briefly describe the general approach you took to achieve this goal.

  • The first part looks at the SingleR and clustering results. Here I just show UMAPs of the results. Because we won't have a set number of clusters for each sample, it's hard to define a color palette. Instead, I show faceted UMAPs where the cell type or cluster for each panel is highlighted in red, rather than assigning a different color to each. Personally I find it easier to see which cells belong to which cluster or cell type in the faceted UMAP, but let me know if you feel differently and want to see a single compiled UMAP.
  • The next section focuses on the output from AUCell using the gene sets from MSigDB. I included a faceted UMAP where cells are colored by AUC, which I don't love, but maybe it will be helpful for some samples. I show density plots of the AUC values, colored by whether or not AUCell classified the gene signature as present in each cell. The last two plots look at AUC values across clusters and cell types in a density plot and a heatmap. I know this is a lot of plots, but I find both the density plot and the heatmap helpful here.
  • The final section repeats the same plots used for AUCell but with the mean gene expression of our custom marker gene sets rather than the AUC values from MSigDB gene sets. Again, the UMAP isn't great, but I don't think it hurts to keep it in. I also included a density plot of each marker gene set, colored by whether or not the cells are considered tumor.
  • I added a new script, plotting-functions.R, with functions for creating the plots that are repeated (density plots, UMAPs, heatmaps). I envision being able to use these functions in later sections of the notebook.
  • When possible, I used functions that we already use in other notebooks rather than write entirely new functions.
  • I made some small adjustments to the setup function based on issues I ran into while plotting.
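
The faceted UMAP described in the first bullet can be sketched roughly as follows. This is a minimal illustration on made-up data, not the code from plotting-functions.R: the data frame `umap_df`, its column names, and the object `highlight_umap` are all hypothetical. It relies on the ggplot2 behavior that a layer whose data lacks the faceting variable is drawn in every panel.

```r
library(ggplot2)

set.seed(2024)

# Toy stand-in for a UMAP embedding with cluster assignments (hypothetical data)
umap_df <- data.frame(
  UMAP1 = rnorm(300),
  UMAP2 = rnorm(300),
  cluster = factor(sample(rep(1:3, each = 100)))
)

# Copy of the coordinates without the faceting column; ggplot2 repeats a layer
# in every panel when its data does not contain the faceting variable
background_df <- umap_df[, c("UMAP1", "UMAP2")]

highlight_umap <- ggplot(umap_df, aes(x = UMAP1, y = UMAP2)) +
  # all cells in gray, shown in every panel
  geom_point(data = background_df, color = "gray80", size = 0.5) +
  # only the cells belonging to the panel's cluster, in red
  geom_point(color = "red", size = 0.5) +
  facet_wrap(vars(cluster))
```

Because the highlight color is the same in every panel, this approach does not need a palette sized to the number of clusters, which is the constraint mentioned above.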

If known, do you anticipate filing additional pull requests to complete this analysis module?

Yes, next up will be the section that looks at subclustering of tumor cells.

Provide directions for reviewers

Here's a rendered copy of the notebook for review:

celltype-exploration.html.zip

Author checklists

Analysis module and review

Reproducibility checklist

  • Code in this pull request has been added to the GitHub Action workflow that runs this module.
  • The dependencies required to run the code in this pull request have been added to the analysis module Dockerfile.
  • If applicable, the dependencies required to run the code in this pull request have been added to the analysis module conda environment.yml file.
  • If applicable, R package dependencies required to run the code in this pull request have been added to the analysis module renv.lock file.

@allyhawkins allyhawkins requested review from sjspielman and removed request for jaclyn-taroni January 28, 2025 17:18

@sjspielman sjspielman left a comment

This seems fine to me overall, and I don't have many comments! Just some small items...

  • I don't love auc_assignment as a label for the density plot where it appears since I don't think it's very informative. Maybe we don't even need to include AUCell in the label and just go with something like in_geneset?
  • The density plot in the section Mean expression of custom gene sets needs to be bigger so that all the strip labels are visible.
  • Some UMAPs have axis labels/ticks, and some don't. I'd remove these all around.
  • I might again turn off the row clustering in the heatmaps - in other words, don't have the heatmap cluster the clusters. But if this view is more informative for you, or if you don't think it matters, then I wouldn't bother
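
As a rough illustration of the last suggestion, here is a minimal sketch of a heatmap with row clustering turned off. It uses base R's `stats::heatmap()` on a made-up matrix, not whichever heatmap package the notebook actually uses; the matrix `auc_mat` and its row and column names are hypothetical.

```r
set.seed(2024)

# Toy matrix of per-cluster summary values (hypothetical data)
auc_mat <- matrix(
  runif(12),
  nrow = 4,
  dimnames = list(paste0("cluster_", 1:4), paste0("geneset_", 1:3))
)

# Rowv = NA suppresses the row dendrogram and the row reordering that comes
# with it, so clusters stay in their input order; columns are still reordered
hm <- heatmap(auc_mat, Rowv = NA, scale = "none")
```

Other heatmap packages expose an equivalent switch under a different argument name, so the option to change depends on the plotting function in use.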

@@ -44,6 +44,10 @@ theme_set(

# set seed
set.seed(2024)

# quiet messages
options(readr.show_col_types = FALSE)

ugh it is so tempting to put this in my ~/.Rprofile 😂

@allyhawkins

@sjspielman I made all the changes you suggested in #1008 (review) in f7829aa. This should be ready for another look!

New copy of the report:
celltype-exploration.html.zip


@sjspielman sjspielman left a comment

LGTM!

@allyhawkins allyhawkins merged commit 2888ffb into AlexsLemonade:main Jan 30, 2025
3 checks passed
@allyhawkins allyhawkins deleted the allyhawkins/summary-plots-ewings branch January 30, 2025 15:23