---
title: "scDotPlot"
author:
  - name: "Ben Laufer"
output:
  BiocStyle::html_document:
    toc_float: TRUE
date: "`r doc_date()`"
package: "`r pkg_ver('scDotPlot')`"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{scDotPlot}
  %\VignetteEncoding{UTF-8}
---
```{r Setup, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE,
                      message = FALSE,
                      warning = FALSE,
                      crop = NULL)
```
# Introduction
Dot plots of single-cell RNA-seq data allow for an examination of the relationships between cell groupings (e.g. clusters) and marker gene expression. The scDotPlot package offers a unified approach to perform a hierarchical clustering analysis and add annotations to the columns and/or rows of a scRNA-seq dot plot. It works with SingleCellExperiment and Seurat objects as well as data frames. The `scDotPlot()` function uses data from `scater::plotDots()` or `Seurat::DotPlot()` along with the `aplot` package to add dendrograms from `ggtree` and optional annotations.
# Installation
```{r Install, eval = FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("scDotPlot")
```
To install the development version directly from GitHub:
```{r Install GitHub, eval = FALSE}
if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
}
remotes::install_github("ben-laufer/scDotPlot")
```
# SingleCellExperiment
## Prepare object
First, we normalize the object and then, for the purpose of this example, subset to remove cells without cell-type labels.
```{r Prepare SingleCellExperiment}
library(scRNAseq)
library(scuttle)
sce <- ZeiselBrainData()
sce <- sce |>
    logNormCounts() |>
    subset(x = _, , level2class != "(none)")
```
## Get features
The `features` argument accepts a character vector of gene IDs. For this example, we quickly obtain the top markers for each cell type and then add them to the rowData of the object.
```{r Get features SingleCellExperiment}
library(scran)
library(purrr)
library(dplyr)
library(AnnotationDbi)
features <- sce |>
    scoreMarkers(sce$level1class) |>
    map(~ .x |>
            as.data.frame() |>
            arrange(desc(mean.AUC)) |>
            dplyr::slice(1:6) |>
            rownames()) |>
    unlist2()

rowData(sce)$Marker <- features[match(rownames(sce), features)] |>
    names()
```
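To make the lookup in the last step concrete, here is a minimal base-R sketch on toy data. The gene and cell-type names are hypothetical, and `names(features)[...]` is used here as a simplified stand-in for the piped form in the chunk above. It shows how `match()` maps every gene in the object to the cell type it marks, if any.

```r
# Toy stand-in for the named marker vector: names are cell types,
# values are gene IDs (all names here are hypothetical).
features <- c(neuron = "GeneA", neuron = "GeneB", glia = "GeneC")

# Toy stand-in for rownames(sce): every gene in the object.
all_genes <- c("GeneC", "GeneX", "GeneA", "GeneB")

# For each gene, find its position in the marker vector (NA if it is not a marker).
idx <- match(all_genes, features)

# Look up the cell-type label at that position; non-markers get NA.
marker <- names(features)[idx]
marker
```

Genes that are not among the selected markers receive `NA`, so only the marker genes carry a label.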
## Plot logcounts
Finally, we create the plot. The group arguments (`group` and `groupAnno`) use the colData, while the feature annotation (`featureAnno`) uses the rowData. The `annoColors` argument can be used to manually specify the colors for the annotations specified through `groupAnno` and `featureAnno`. The clustering of the columns shows that the cell sub-types cluster by cell type, while the clustering of the rows shows that most of the markers cluster by their cell type.
```{r scePlot1, fig.cap = "scDotPlot of SingleCellExperiment logcounts", fig.width=12, fig.height=12, dpi=50}
library(scDotPlot)
library(ggsci)
sce |>
    scDotPlot(features = features,
              group = "level2class",
              groupAnno = "level1class",
              featureAnno = "Marker",
              groupLegends = FALSE,
              annoColors = list("level1class" = pal_d3()(7),
                                "Marker" = pal_d3()(7)),
              annoWidth = 0.02)
```
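As a rough illustration of what the column dendrogram summarizes, the following base-R sketch clusters the columns of a small, made-up expression matrix. The matrix, its names, and the use of `hclust()` with Euclidean distance are assumptions for illustration only; the package draws its dendrograms through `ggtree` and may use different settings.

```r
# Hypothetical average expression: rows are genes, columns are cell sub-types.
expr <- cbind(
    neuron_1 = c(5, 4, 0, 0),
    neuron_2 = c(4, 5, 1, 0),
    glia_1   = c(0, 1, 5, 4),
    glia_2   = c(0, 0, 4, 5)
)
rownames(expr) <- c("GeneA", "GeneB", "GeneC", "GeneD")

# Cluster the columns: transpose so that each sub-type is one observation.
col_tree <- hclust(dist(t(expr)))

# Cut the tree into two groups; sub-types of the same cell type fall together.
col_groups <- cutree(col_tree, k = 2)
col_groups
```

Clustering the rows works the same way without the transpose, that is, `hclust(dist(expr))`.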
## Plot Z-scores
Plotting by Z-score through `scale = TRUE` improves the clustering result for the rows.
```{r scePlot2, fig.cap = "scDotPlot of SingleCellExperiment Z-scores", fig.width=12, fig.height=12, dpi=50}
sce |>
    scDotPlot(scale = TRUE,
              features = features,
              group = "level2class",
              groupAnno = "level1class",
              featureAnno = "Marker",
              groupLegends = FALSE,
              annoColors = list("level1class" = pal_d3()(7),
                                "Marker" = pal_d3()(7)),
              annoWidth = 0.02)
```
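One way to see why Z-scores can help the row clustering is the generic base-R sketch below. It uses a made-up matrix and plain row-wise scaling, which is not necessarily the exact computation performed inside the package. Two genes with the same pattern across groups but very different magnitudes are far apart on the raw scale and identical after scaling.

```r
# Hypothetical average expression: rows are genes, columns are groups.
avg <- rbind(
    GeneA = c(1, 2, 3),    # low expression, rising pattern
    GeneB = c(10, 20, 30), # high expression, same rising pattern
    GeneC = c(3, 2, 1)     # low expression, opposite pattern
)
colnames(avg) <- c("g1", "g2", "g3")

# Row-wise Z-score: scale() works on columns, so transpose before and after.
z <- t(scale(t(avg)))

# Raw scale: GeneA is closer to GeneC (similar magnitude) than to GeneB.
raw_dist <- as.matrix(dist(avg))

# Z-score scale: GeneA and GeneB coincide because their patterns match.
z_dist <- as.matrix(dist(z))

raw_dist["GeneA", c("GeneB", "GeneC")]
z_dist["GeneA", c("GeneB", "GeneC")]
```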
# Seurat
## Get features
After loading the example Seurat object, we find the top markers for each cluster and add them to the feature-level metadata of the default assay.
```{r Get features Seurat}
library(Seurat)
library(SeuratObject)
library(tibble)
data("pbmc_small")
features <- pbmc_small |>
    FindAllMarkers(only.pos = TRUE, verbose = FALSE) |>
    group_by(cluster) |>
    dplyr::slice(1:6) |>
    dplyr::select(cluster, gene)

pbmc_small[[DefaultAssay(pbmc_small)]][[]] <- pbmc_small[[DefaultAssay(pbmc_small)]][[]] |>
    rownames_to_column("gene") |>
    left_join(features, by = "gene") |>
    column_to_rownames("gene")

features <- features |>
    deframe()
```
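The final `deframe()` step in the chunk above turns the two-column marker table into a named vector, with the first column as names and the second as values. A minimal base-R sketch of the same reshaping, using a made-up table and `setNames()` as a stand-in for `tibble::deframe()`, looks like this:

```r
# Hypothetical marker table with the same two columns as in the chunk above.
marker_table <- data.frame(
    cluster = c("0", "0", "1"),
    gene = c("GeneA", "GeneB", "GeneC"),
    stringsAsFactors = FALSE
)

# First column becomes the names, second column becomes the values.
marker_vector <- setNames(marker_table$gene, marker_table$cluster)
marker_vector
```

The result has the same shape as the named `features` vector used in the SingleCellExperiment example: gene IDs as values, labeled by the group they mark.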
## Plot logcounts
Plotting a Seurat object is similar to plotting a SingleCellExperiment object.
```{r SeuratPlot1, fig.cap = "scDotPlot of Seurat logcounts", fig.width=4, fig.height=5, out.width="75%", out.height="75%", dpi=50}
pbmc_small |>
    scDotPlot(features = features,
              group = "RNA_snn_res.1",
              groupAnno = "RNA_snn_res.1",
              featureAnno = "cluster",
              annoColors = list("RNA_snn_res.1" = pal_d3()(7),
                                "cluster" = pal_d3()(7)),
              groupLegends = FALSE,
              annoWidth = 0.075)
```
## Plot Z-scores
Again, we see that plotting by Z-score improves the clustering result for the rows.
```{r SeuratPlot2, fig.cap = "scDotPlot of Seurat Z-scores", fig.width=4, fig.height=5, out.width="75%", out.height="75%", dpi=50}
pbmc_small |>
    scDotPlot(scale = TRUE,
              features = features,
              group = "RNA_snn_res.1",
              groupAnno = "RNA_snn_res.1",
              featureAnno = "cluster",
              annoColors = list("RNA_snn_res.1" = pal_d3()(7),
                                "cluster" = pal_d3()(7)),
              groupLegends = FALSE,
              annoWidth = 0.075)
```
# Package support
The [Bioconductor support site](https://support.bioconductor.org/) is the preferred method to ask for help. Before posting, it's recommended to check [previous posts](https://support.bioconductor.org/tag/scDotPlot/) for the answer and look over the [posting guide](http://www.bioconductor.org/help/support/posting-guide/). For the post, it's important to use the `scDotPlot` tag and provide both a minimal reproducible example and session information.
# Acknowledgement
This package was inspired by the [single-cell example from aplot](https://yulab-smu.top/pkgdocs/aplot.html#a-single-cell-example).
# Session info
```{r Session info, echo=FALSE}
sessionInfo()
```