Add MultiQC report #40

Open
grst opened this issue May 26, 2023 · 13 comments
Assignees
Labels
enhancement Improvement for existing functionality

Comments

@grst
Member

grst commented May 26, 2023

Description of feature

Having individual QC reports for each sample is nice, but it would be cool to have one aggregated report that gives an overview of all samples to quickly identify problematic ones.

MultiQC is an obvious choice here, but depending on how much customization is necessary, a custom notebook would also be an option.

@grst grst added the enhancement Improvement for existing functionality label May 26, 2023
@fasterius
Collaborator

While MultiQC is indeed a natural choice for aggregation of QC metrics and statistics, I wonder if it's appropriate here. MultiQC usually takes the pre-defined output of specific tools; the output of the reports and their content is not from a single tool and might change in the future. I'm thinking that an additional Quarto report might be more appropriate here, but this can certainly be discussed. Thoughts, @cavenel ?

@grst
Member Author

grst commented Jun 5, 2023

I'm open to both solutions. It will definitely be quicker to get something up and running with another Quarto report. However, it might also need quite a bit of custom code for parsing summary statistics, and at some point it would be better to have a separate package.

For additional context: for single-cell data, there's the checkatlas package, which generates a MultiQC report, and we're considering adding it to the #scrnaseq pipeline (nf-core/scrnaseq#80). Here's an example report:
https://checkatlas.readthedocs.io/en/stable/CheckAtlas_example_2/CheckAtlas_example_2.html

So one option could be to make a similar package, or add spatial support to checkatlas.
CC @drbecavin

@fasterius
Collaborator

I do like the general idea of using MultiQC, if only because it's so common for nf-core pipelines, but it's probably a lot more effort. While getting another Quarto report might be a good solution for now, it can certainly be discussed whether to spend the effort to create some interface for MultiQC in the future. As it is, the current Quarto reports could also get some additional work and prettifying in addition to what we already have.

I hadn't seen checkatlas before, thanks for sharing! Using something that already exists is always a good solution, if it can interface with spatial stuff.

@cavenel
Collaborator

cavenel commented Jun 9, 2023

I like the idea of using checkatlas, as it can generate the MultiQC output directly from the AnnData outputs of all samples. I wonder what kind of extra QC we could add from spatial though.
Maybe as a first step, adding checkatlas as is would already be nice!

@grst
Member Author

grst commented Jun 14, 2023

I wonder if the multiqc cellranger module could directly work with spaceranger html reports as well. This multiqc module focuses more on alignment statistics, so it would be valuable to have in addition to checkatlas.

@grst
Member Author

grst commented Jun 29, 2023

> I wonder if the multiqc cellranger module could directly work with spaceranger html reports as well

It doesn't, but it should be relatively straightforward to adapt its code to add a separate spaceranger module to multiqc. Maybe I can look into that.

My plan for improving QC for spatialtranscriptomics:

  1. Add MultiQC and FastQC to set up a basic QC workflow
  2. Implement a spaceranger module in multiqc
  3. Look into custom content for multiqc reports, or checkatlas

@grst grst self-assigned this Jun 29, 2023
@grst grst mentioned this issue Jun 29, 2023
10 tasks
@fasterius
Collaborator

Getting a Space Ranger module for MultiQC would be nice! That coupled with some custom content from the reports would be a nice starting point to build on.

@grst
Member Author

grst commented Jul 11, 2023

MultiQC module is ready, waiting for review: MultiQC/MultiQC#1945

@ducminhnguyenle

ducminhnguyenle commented Oct 26, 2023

I recently tried your spaceranger multiqc module, and the multiqc report looks very nice. Would it be possible to also include the "Gene and UMI Distribution" violin plot from the 10X report and parse it into multiqc_report as a normal boxplot? Thank you. @grst
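
For illustration only: a box plot needs just a five-number summary per sample, so the violin data could in principle be reduced to something like the sketch below. This assumes the per-spot values are available as plain lists, which the thread does not confirm for the 10X report. The sample names, the numbers, and the `box_summary` helper are all hypothetical and not part of the spaceranger module.

```python
import statistics


def box_summary(values):
    """Return the five numbers a box plot is drawn from.

    Needs at least two values, because statistics.quantiles does.
    """
    ordered = sorted(values)
    # "inclusive" treats the data as the whole population (min and max included).
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }


# Hypothetical genes-per-spot values for two samples (made-up numbers).
genes_per_spot = {
    "sample_A": [1, 2, 3, 4, 5],
    "sample_B": [10, 20, 30, 40, 50, 60],
}

# One small dict per sample, regardless of how many spots it has.
summaries = {name: box_summary(vals) for name, vals in genes_per_spot.items()}
```

Because each sample collapses to five numbers, the result stays the same size per sample no matter how many spots were measured.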

@grst
Member Author

grst commented Feb 29, 2024

Multiqc module now released: https://github.com/MultiQC/MultiQC/releases/tag/v1.21

@fasterius
Collaborator

Where are we on this issue at the moment? The MultiQC Space Ranger module works great, but obviously only contains the Space Ranger-specific QC metrics. Given that checkatlas does not seem to be a simple solution and also appears to be unmaintained (last update was 9 months ago), that's probably not the way to go. Adding another Quarto report to collect some of the plots might help provide an overview of the downstream analyses. Other ideas?

@grst
Member Author

grst commented May 21, 2024

I've seen some activity on checkatlas in the issue tracker lately, but I agree it's not a short-term solution. I think it would be nice if any other QC metrics were part of the multiqc report, to have a single location to check. This would be possible via a custom script + multiqc custom content.

Overall I think the spaceranger metrics are already quite helpful; more would be nice, but I'm not sure it's high priority.
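
As a rough sketch of the "custom script + multiqc custom content" idea: MultiQC picks up files whose names end in `_mqc.tsv` and reads `#`-prefixed header lines as section configuration. The script below builds such a table from per-sample metrics. The metric names, values, section id and the `build_custom_content` helper are assumptions for illustration, not what the pipeline actually emits.

```python
import csv
import io


def build_custom_content(metrics, section_id, section_name, description):
    """Return the text of a MultiQC custom-content table (TSV).

    `metrics` maps sample name -> {metric name -> value}.
    Header values are written in single quotes, so they must not
    contain single quotes themselves.
    """
    # Collect metric names in first-seen order for a stable column layout.
    columns = []
    for values in metrics.values():
        for key in values:
            if key not in columns:
                columns.append(key)

    buf = io.StringIO()
    # '#'-prefixed header lines configure the report section.
    buf.write(f"# id: '{section_id}'\n")
    buf.write(f"# section_name: '{section_name}'\n")
    buf.write(f"# description: '{description}'\n")
    buf.write("# plot_type: 'table'\n")

    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["Sample"] + columns)
    for sample in sorted(metrics):
        # Missing metrics are left as empty cells.
        writer.writerow([sample] + [metrics[sample].get(c, "") for c in columns])
    return buf.getvalue()


# Hypothetical per-sample metrics (made-up names and numbers).
example_metrics = {
    "sample_B": {"n_spots": 1800, "median_genes_per_spot": 3100},
    "sample_A": {"n_spots": 2000, "median_genes_per_spot": 3500},
}

table_text = build_custom_content(
    example_metrics,
    section_id="spatial_qc",
    section_name="Spatial QC metrics",
    description="Per-sample QC metrics collected by a custom script",
)
```

Writing `table_text` to a file such as `spatial_qc_mqc.tsv` (hypothetical name) in the directory MultiQC scans should be enough for the table to appear as its own section, with one row per sample.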

@fasterius
Collaborator

I have now added functionality to get the QC metrics from the quality control report into MultiQC as custom content (see mention above), so now the question becomes if we're happy with this for now or if we want to add more.

Adding other interesting metrics should be easy to do in a similar manner if we want to, but I was wondering about figures. Since we can't know how many samples somebody would run at the same time, I'm not sure if adding images (e.g. QC violin plots, UMAPs or spatial visualisations) would be scalable.


4 participants