Ordination.Rmd

---
title: "Ordination analysis"
author: "Leo Lahti, Sudarshan Shetty et al."
bibliography: 
- bibliography.bib
output:
  BiocStyle::html_document:
    number_sections: no
    toc: yes
    toc_depth: 4
    toc_float: true
    self_contained: true
    thumbnails: true
    lightbox: true
    gallery: true
    use_bookdown: false
    highlight: haddock
---
<!--
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{microbiome tutorial - ordination}
  %\usepackage[utf8]{inputenc}
  %\VignetteEncoding{UTF-8}  
-->


Full examples for standard ordination techniques applied to phyloseq data, based on the [phyloseq ordination tutorial](http://joey711.github.io/phyloseq/plot_ordination-examples.html). For handy wrappers for some common ordination tasks in microbiome analysis, see [landscaping examples](Landscaping.html)


Load example data:

```{r ordination1, message=FALSE, warning=FALSE, eval=TRUE}
library(microbiome)
library(phyloseq)
library(ggplot2)
data(dietswap)
pseq <- dietswap

# Convert to compositional data
pseq.rel <- microbiome::transform(pseq, "compositional")

# Pick core taxa with with the given prevalence and detection limits
pseq.core <- core(pseq.rel, detection = .1/100, prevalence = 90/100)

# Use relative abundances for the core
pseq.core <- microbiome::transform(pseq.core, "compositional")
```


## Sample ordination

Project the samples with the given method and dissimilarity measure. 

```{r ordination2, message=FALSE, warning=FALSE, results="hide"}
# Ordinate the data
set.seed(4235421)
# proj <- get_ordination(pseq, "MDS", "bray")
ord <- ordinate(pseq, "MDS", "bray")
```


## Multidimensional scaling (MDS / PCoA)

```{r ordination-ordinate23, warning=FALSE, message=FALSE, fig.width=8, fig.height=6, fig.show="hold", out.width="400px"}
plot_ordination(pseq, ord, color = "nationality") +
                geom_point(size = 5)
```


## Canonical correspondence analysis (CCA)

```{r ordination-ordinate24a, warning=FALSE, message=FALSE, fig.width=8, fig.height=6, fig.show="hold", out.width="400px"}
# With samples
pseq.cca <- ordinate(pseq, "CCA")
p <- plot_ordination(pseq, pseq.cca,
       type = "samples", color = "nationality")
p <- p + geom_point(size = 4)
print(p)

# With taxa:
p <- plot_ordination(pseq, pseq.cca,
       type = "taxa", color = "Phylum")
p <- p + geom_point(size = 4)
print(p)
```


## Split plot

```{r ordination-ordinate25, warning=FALSE, message=FALSE, fig.width=14, fig.height=5}
plot_ordination(pseq, pseq.cca,
		      type = "split", shape = "nationality", 
    		      color = "Phylum", label = "nationality")
```


## t-SNE

t-SNE is a popular new ordination technique.

```{r tsne, warning=FALSE, message=FALSE, fig.width=14, fig.height=5}
library(vegan)
library(microbiome)
library(Rtsne) # Load package
set.seed(423542)

method <- "tsne"
trans <- "hellinger"
distance <- "euclidean"

# Distance matrix for samples
ps <- microbiome::transform(pseq, trans)

# Calculate sample similarities
dm <- vegdist(otu_table(ps), distance)

# Run TSNE
tsne_out <- Rtsne(dm, dims = 2) 
proj <- tsne_out$Y
rownames(proj) <- rownames(otu_table(ps))

library(ggplot2)
p <- plot_landscape(proj, legend = T, size = 1) 
print(p)
```