forked from microbiome/tutorials
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathStability.Rmd
executable file
·150 lines (108 loc) · 5.22 KB
/
Stability.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
---
title: "Microbiome stability analysis"
author: "Leo Lahti, Sudarshan Shetty et al."
bibliography:
- bibliography.bib
output:
BiocStyle::html_document:
number_sections: no
toc: yes
toc_depth: 4
toc_float: true
self_contained: true
thumbnails: true
lightbox: true
gallery: true
use_bookdown: false
highlight: haddock
---
<!--
%\VignetteEngine{knitr::rmarkdown}
%\VignetteIndexEntry{microbiome tutorial - stability}
%\usepackage[utf8]{inputenc}
%\VignetteEncoding{UTF-8}
-->
Get example data - [HITChip Atlas of 130 genus-like taxa across 1006 healthy western adults](http://www.nature.com/ncomms/2014/140708/ncomms5344/full/ncomms5344.html). A subset of 76 subjects have also short time series available for temporal stability analysis:
```{r bistability, message=FALSE, warning=FALSE, error=FALSE}
# Load the example data
set.seed(134)
library(microbiome)
library(dplyr)
data(atlas1006)
# Rename the example data
pseq <- atlas1006
# Focus on specific subset
pseq <- pseq %>% subset_samples(DNA_extraction_method == "r")
# Use relative abundances
pseq <- microbiome::transform(pseq, "compositional")
# Merge rare taxa to speed up examples
pseq <- aggregate_rare(pseq, level = "Genus", detection = .1/100, prevalence = 10/100)
# For cross-sectional analysis, use only the baseline time point:
pseq0 <- baseline(pseq)
```
## Intermediate stability quantification
It has been reported that certain microbial groups exhibit bi-stable
abundance distributions with distinct peaks at low and high
abundances, and an instable intermediate abundance range. Instability
at the intermediate abundance range is hence one indicator of
bi-stability. [Lahti et al. 2014](http://www.nature.com/ncomms/2014/140708/ncomms5344/full/ncomms5344.html) used straightforward correlation analysis to quantify how the distance from the intermediate abundance region (50% quantile) is associated with the observed shifts between consecutive time points.
```{r bistability2, warning=FALSE, error=FALSE, message=FALSE}
# Here we use the (non-baseline) phyloseq object that contains time series.
intermediate.stability <- intermediate_stability(pseq, output = "scores")
```
## Bimodality quantification
Check the [bimodality page](Bimodality.html) for more examples on bimodality indicators.
Bimodality of the abundance distribution provides another (indirect)
indicator of bistability, although other explanations such as sampling
biases etc. should be controlled. Multiple bimodality scores are
available.
Multimodality score using [potential analysis with bootstrap](http://www.nature.com/ncomms/2014/140708/ncomms5344/full/ncomms5344.html)
```{r bimodality2, message=FALSE, warning=FALSE}
# Bimodality is better estimated from abundances at log scale (such as CLR)
pseq0.clr <- microbiome::transform(pseq0, "clr")
set.seed(4433)
# In practice, it is recommended to use more bootstrap iterations than in this example
bimodality.score <- bimodality(pseq0.clr, method = "potential_analysis",
bs.iter = 20, peak.threshold = 10,
min.density = 10)
```
## Comparing bimodality and intermediate stability
The analysis suggests that bimodal population distribution across individuals is often associated with instable intermediate abundances within individuals. The specific bi-stable groups in the upper left corner were suggested to constitute bistable tipping elements of the human intestinal microbiota in [Lahti et al. Nat. Comm. 5:4344, 2014](http://www.nature.com/ncomms/2014/140708/ncomms5344/full/ncomms5344.html):
```{r bimodalitybistability, message=FALSE, warning=FALSE}
taxa <- taxa(pseq0)
df <- data.frame(group = taxa,
intermediate.stability = intermediate.stability[taxa],
bimodality = bimodality.score[taxa])
theme_set(theme_bw(20))
library(ggrepel)
p <- ggplot(df,
aes(x = intermediate.stability, y = bimodality, label = '')) + #Doesn't add labels to points
geom_text_repel() +
geom_point()
#Labels only those that have bimodality > 0.4 and intermediate stability < 0, adjusts the placement of labels
p <- p + geom_text(aes(label = ifelse(df$bimodality > 0.6 | df$intermediate.stability < -0.25, group, '')), vjust = "inward", hjust = "inward")
print(p)
```
## Tipping point detection
Identify potential minima in cross-sectional population data as
tipping point candidates.
```{r stability-tipping, message=FALSE, warning=FALSE}
# Log10 abundance for a selected taxonomic group
# Pick the most bimodal taxa as an example
tax <- names(which.max(bimodality.score))
# Detect tipping points at log10 abundances
x <- abundances(microbiome::transform(pseq, "clr"))[tax,]
# Bootstrapped potential analysis to identify potential minima
# in practice, use more bootstrap iterations
potential.minima <- potential_analysis(x, bs.iter = 10)$minima
# Same with earlywarnings package (without bootstrap ie. less robust)
# library(earlywarnings)
# res <- livpotential_ews(x)$min.points
# Identify the potential minimum locations as tipping point candidates
tipping.samples <- sapply(potential.minima, function (m) {names(which.min(abs(sort(x) - m)))})
tipping.point <- abundances(pseq)[tax, tipping.samples]
print(tipping.point)
# Illustrate the detected tipping point
plot(density(x), main = tax)
abline(v = x[tipping.samples])
```