06-MAW-PV.Rmd

---
title: "OPEN & REPRODUCIBLE MICROBIOME DATA ANALYSIS SPRING SCHOOL 2018"
author: "Sudarshan"
date: "`r Sys.Date()`"
output: bookdown::gitbook
site: bookdown::bookdown_site
---

# Core microbiota  

For more information:

[The adult intestinal core microbiota is determined by analysis depth and health status](https://www.sciencedirect.com/science/article/pii/S1198743X14609629?via%3Dihub).  

[Intestinal microbiome landscaping: insight in community assemblage and implications for microbial modulation strategies](https://academic.oup.com/femsre/article/41/2/182/2979411).  

[Intestinal Microbiota in Healthy Adults: Temporal Analysis Reveals Individual and Common Core and Relation to Intestinal Symptoms](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0023035).  

```{r, warning=FALSE, message=FALSE}

library(microbiome) # data analysis and visualisation
library(phyloseq) # also the basis of data object. Data analysis and visualisation
library(RColorBrewer) # nice color options
library(ggpubr) # publication quality figures, based on ggplot2
library(dplyr) # data handling  

```

## Core microbiota anlaysis   

We will use the filtered phyloseq object from previous tutorial. We will use the filtered phyloseq object from the first section for pre-processioning.

```{r}

# read non rarefied data
ps1 <- readRDS("./phyobjects/ps.ng.tax.rds")

# use print option to see the data saved as phyloseq object.

```

Subset the data to keep only stool samples.  

```{r}

ps1.stool <- subset_samples(ps1, bodysite == "Stool")

# convert to relative abundance  
ps1.stool.rel <- microbiome::transform(ps1.stool, "compositional")
print(ps1.stool.rel)

ps1.stool.rel2 <- prune_taxa(taxa_sums(ps1.stool.rel) > 0, ps1.stool.rel)

print(ps1.stool.rel2)

```

Check for the core ASVs  

```{r}

core.taxa.standard <- core_members(ps1.stool.rel2, detection = 0.001, prevalence = 50/100)

print(core.taxa.standard)

```

There are 16 ASVs that are core based on the cut-offs for prevalence and detection we choose. However, we only see IDs, not very informative. We can get the classification of these as below.    

```{r}

# Extract the taxonomy table

taxonomy <- as.data.frame(tax_table(ps1.stool.rel2))

# Subset this taxonomy table to include only core OTUs  
core_taxa_id <- subset(taxonomy, rownames(taxonomy) %in% core.taxa.standard)

DT::datatable(core_taxa_id)

```


## Core abundance and diversity  
Total core abundance in each sample (sum of abundances of the core members):

```{r}

core.abundance <- sample_sums(core(ps1.stool.rel2, detection = 0.001, prevalence = 50/100))

DT::datatable(as.data.frame(core.abundance))

```


## Core visualization  

### Core heatmaps  

This visualization method has been used for instance in [Intestinal microbiome landscaping: insight in community assemblage and implications for microbial modulation strategies](https://academic.oup.com/femsre/article/41/2/182/2979411).  

Note that you can order the taxa on the heatmap with the order.taxa argument.

```{r}

# Core with compositionals:
prevalences <- seq(.05, 1, .05)
detections <- 10^seq(log10(1e-3), log10(.2), length = 10)

# Also define gray color palette
gray <- gray(seq(0,1,length=5))
p.core <- plot_core(ps1.stool.rel2, 
                    plot.type = "heatmap", 
                    colours = gray,
                    prevalences = prevalences, 
                    detections = detections, 
                    min.prevalence = .5) +
    xlab("Detection Threshold (Relative Abundance (%))")
print(p.core)    


# Same with the viridis color palette
# color-blind friendly and uniform
# options: viridis, magma, plasma, inferno
# https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html
# Also discrete=TRUE versions available
library(viridis)
print(p.core + scale_fill_viridis())

```

Color change 

```{r}

# Core with compositionals:
prevalences <- seq(.05, 1, .05)
detections <- 10^seq(log10(1e-3), log10(.2), length = 10)

# Also define gray color palette

p.core <- plot_core(ps1.stool.rel2, 
                    plot.type = "heatmap", 
                    colours = rev(brewer.pal(5, "Spectral")),
                    prevalences = prevalences, 
                    detections = detections, 
                    min.prevalence = .5) + 
  xlab("Detection Threshold (Relative Abundance (%))")

print(p.core) 

```

Use the `format_to_besthit` function from microbiomeutilities to get the best classification of the ASVs.  

```{r}

ps1.stool.rel2.f <- microbiomeutilities::format_to_besthit(ps1.stool.rel2)

p.core <- plot_core(ps1.stool.rel2.f, 
                    plot.type = "heatmap", 
                    colours = rev(brewer.pal(5, "Spectral")),
                    prevalences = prevalences, 
                    detections = detections, 
                    min.prevalence = .5) + 
  xlab("Detection Threshold (Relative Abundance (%))")

p.core + theme(axis.text.y = element_text(face="italic"))

print(p.core)

```


```{r}

sessionInfo()

```