diff --git a/completed-notebooks/.gitkeep b/completed-notebooks/.gitkeep deleted file mode 100644 index e69de29..0000000 diff --git a/completed-notebooks/intro-to-R-tidyverse/01-intro_to_base_R.nb.html b/completed-notebooks/intro-to-R-tidyverse/01-intro_to_base_R.nb.html new file mode 100644 index 0000000..efd9579 --- /dev/null +++ b/completed-notebooks/intro-to-R-tidyverse/01-intro_to_base_R.nb.html @@ -0,0 +1,4031 @@ + + + + + + + + + + + + + + + +Introduction to R and RStudio + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + +
+

Objectives

+

This notebook will demonstrate how to:

+
    +
  • Navigate the RStudio environment
    +
  • +
  • Use R for simple calculations, both mathematical and logical
    +
  • +
  • Define and use variables in base R
    +
  • +
  • Understand and apply base R functions
    +
  • +
  • Understand, define, and use R data types, including vector +manipulation and indexing
    +
  • +
  • Understand the anatomy of a data frame
  • +
+
+ +
+
+

What is R?

+

R is a statistical computing language that is +open source, meaning the underlying code for the language is +freely available to anyone. You do not need a special license or set of +permissions to use and develop code in R.

+

R itself is an interpreted computer language and comes with +functionality bundled with the language itself, known as +“base R”. But there is also rich additional +functionality provided by external packages, or +libraries of code that assist in accomplishing certain tasks and can be +freely downloaded and loaded for use.

+

In the next notebook and subsequent modules, we will be using a suite +of packages collectively known as The Tidyverse. The +tidyverse is geared towards intuitive data science +applications that follow a shared data philosophy. But there are still +many core features of base R which are important to be aware of, and we +will be using concepts from both base R and the tidyverse in our +analyses, as well as task specific packages for analyses such as gene +expression.

+
+

What is RStudio?

+

RStudio is a graphical environment (“integrated development +environment” or IDE) for writing and developing R code. RStudio is NOT a +separate programming language - it is an interface we use to facilitate +R programming. In other words, you can program in R without RStudio, but +you can’t use the RStudio environment without R.

+

For more information about RStudio than you ever wanted to know, see +this RStudio +IDE Cheatsheet (pdf).

+
+
+
+

The RStudio Environment

+

The RStudio environment has four main panes, each of +which may have a number of tabs that display different information or +functionality. (Their specific locations can be changed under Tools -> +Global Options -> Pane Layout.) RStudio Appearance

+
    +
  1. The Editor pane is where you can write R scripts +and other documents. Each tab here is its own document. This is your +text editor, which will allow you to save your R code for +future use. Note that code you write or change here will not run until +you run it.
  2. The Console pane is where you can +interactively run R code.
    • There is also a Terminal tab here which can be used +for running programs outside R on your computer.
  3. The Environment pane primarily displays the +variables, sometimes known as objects, that are defined during a +given R session, and what data or values they might hold.
  4. The final pane, Files, Plots, Help, …, has +several pretty important tabs:
    • The Files tab shows the structure and contents of +files and folders (also known as directories) on your computer.
    • The Plots tab will reveal plots when you make +them.
    • The Packages tab shows which installed packages +have been loaded into your R session.
    • The Help tab will show the help page when you look +up a function.
    • The Viewer tab will reveal compiled R Markdown +documents.
+
+
+

Basic Calculations

+
+

Mathematical operators

+

The most basic use of R is as a regular calculator:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Operation | Symbol |
|-----------|--------|
| Add | + |
| Subtract | - |
| Multiply | * |
| Divide | / |
| Exponentiate | ^ or ** |
+

For example, we can do some simple multiplication like this. When you +execute code within the notebook, the results appear beneath the code. +Try executing this chunk by clicking the Run button within the +chunk or by placing your cursor inside it and pressing +Cmd+Shift+Enter on a Mac, or Ctrl+Shift+Enter on a +PC.

+ + + +
5 * 6
+ + +
[1] 30
+ + + +

Use the console to calculate other expressions. Standard order of +operations applies (mostly), and you can use parentheses () +as you might expect (but not brackets [] or +braces {}, which have special meanings). Note, however, that +you must always specify multiplication with +*; implicit multiplication such as 10(3 + 4) +or 10x will not work and will generate an error, or +worse.

+ + + +
10 * (3 + 4)^2
+ + +
[1] 490
+ + + +
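As a small illustration of the points above, the following sketch shows how parentheses change a result and why multiplication has to be written out with `*`. The variable names are made up for this example.

```r
# Exponentiation is evaluated before multiplication, which is evaluated before addition
without_parens <- 10 * 3 + 4^2    # 30 + 16
with_parens <- 10 * (3 + 4)^2     # 10 * 49

without_parens
with_parens

# Both exponent symbols give the same answer
2^3 == 2**3

# Implicit multiplication is not allowed: with 10(3 + 4), R tries to call
# `10` as a function and stops with an error, so always write the `*`
explicit_product <- 10 * (3 + 4)
explicit_product
```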
+
+

Defining and using variables

+

To define a variable, we use the assignment operator, which +looks like an arrow: <-. For example, +x <- 7 takes the value on the right-hand side of the +operator and assigns it to the variable name on the left-hand side.

+ + + +
# Define a variable x to equal 7, and print out the value of x
+x <- 7
+
+# We can have R repeat back to us what `x` is by just using `x`
+x
+ + +
[1] 7
+ + + +

Some features of variables, considering the example +x <- 7: Every variable has a name, a +value, and a type. This variable’s +name is x, its value is 7, and its type is +numeric (7 is a number!). Re-defining a variable will +overwrite the value.

+ + + +
x <- 5.5
+
+x
+ + +
[1] 5.5
+ + + +

We can modify an existing variable by assigning a new value to the +same name. Here we’ll add 2 to x and reassign the +result back to x.

+ + + +
x <- x + 2
+
+x
+ + +
[1] 7.5
+ + + +
+
+

Variable naming note:

+

As best you can, it is a good idea to make your variable names +informative (e.g. x doesn’t mean anything, but +sandwich_price is meaningful… if we’re talking about the +cost of sandwiches, that is..).
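To make the naming advice concrete, here is a minimal sketch with informative variable names. The prices and counts are hypothetical values chosen only for illustration.

```r
# Hypothetical values, chosen only for illustration
sandwich_price <- 7.5   # price of one sandwich
sandwich_count <- 3     # number of sandwiches ordered

# The calculation documents itself when the names are informative
total_cost <- sandwich_price * sandwich_count
total_cost

# Re-defining a variable overwrites its previous value
sandwich_price <- sandwich_price + 2
sandwich_price
```

Compare `total_cost <- sandwich_price * sandwich_count` with `z <- x * y`: both run, but only the first tells Future You what was being calculated.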

+
+
+

Comments

+

Arguably the most important aspect of your coding is +comments: Small pieces of explanatory text you leave in your code to +explain what the code is doing and/or leave notes to yourself or others. +Comments are invaluable for communicating your code to others, but they +are most important for Future You. Future You comes +into existence about one second after you write code, and has no idea +what on earth Past You was thinking.

+

Comments in R code are indicated with pound signs (aka +hashtags, octothorpes). R will ignore any text in a line after +the pound sign, so you can put whatever text you like there.

+ + + +
22/7 # not quite pi
+ + +
[1] 3.142857
+ + +
# If we need a better approximation of pi, we can use Euler's formula
+# This uses atan(), which calculates arctangent.
+20 * atan(1/7) + 8 * atan(3/79) 
+ + +
[1] 3.141593
+ + + +

Help out Future You by adding lots of comments! Future You next week +thinks Today You is an idiot, and the only way you can convince Future +You that Today You is reasonably competent is by adding comments in your +code explaining why Today You is actually not so bad.

+
+
+
+

Functions

+

We can use pre-built computation methods called “functions” for other +operations. Functions have the following format, where the +argument is the information we are providing to the function +for it to run. An example of this was the atan() function +used above.

+
function_name(argument)
+

To learn about functions, we’ll examine one called log() +first.

+

To learn what a function does and how to use it, type a question mark +followed by the function name, which will reveal documentation in the help pane: +?log

+

The documentation tells us that log() is derived from +{base}, meaning it is a function that is part of base R. It +provides a brief description of what the function does and shows several +examples of how to use it.

+

In particular, the documentation tells us about what argument(s) to +provide:

+
    +
  • The first argument, which is required, is the value we’d like to take +the log of; by default, log() returns its natural log
  • +
  • The second argument, which is optional, can specify a different base +rather than the default e.
  • +
+

Functions also return values for us to use. In the case of +log(), the returned value is the log’d value the function +computed.

+ + + +
log(73)
+ + +
[1] 4.290459
+ + + +

Here we can specify an argument of base to +calculate log base 3.

+ + + +
log(81, base = 3)
+ + +
[1] 4
+ + + +

If we don’t specify the argument names, R assumes they are +in the order that log() defines them. See ?log +to see more about its arguments.

+ + + +
log(8, 2)
+ + +
[1] 3
+ + + +

We can switch the order if we specify the argument names.

+ + + +
log(base = 10, x = 4342)
+ + +
[1] 3.63769
+ + + +

We can also provide variables as arguments in the same way as the raw +values.

+ + + +
meaning <- 42
+log(meaning)
+ + +
[1] 3.73767
+ + + +
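The sketch below puts the three calling styles side by side to show that they give the same result. The variable names are illustrative; `log2()` and `log10()` are base R shortcuts that the text above does not cover.

```r
# Positional arguments are matched in the order log() defines them: x, then base
positional <- log(8, 2)

# Named arguments can be given in any order
named <- log(x = 8, base = 2)
named_reordered <- log(base = 2, x = 8)

# All three calls compute log base 2 of 8
positional
named
named_reordered

# Base R also provides shortcuts for two common bases
log2(8)
log10(1000)
```

Naming arguments takes a little more typing, but it makes the call readable without looking up the argument order.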
+
+

Working with variables

+
+

Variable Types

+

Variable types in R can sometimes be coerced (converted) +from one type to another.

+ + + +
# Define a variable with a number
+x <- 15
+ + + +

The function class() will tell us the variable’s +type.

+ + + +
class(x)
+ + +
[1] "numeric"
+ + + +

Let’s coerce it to a character.

+ + + +
x <- as.character(x)
+class(x)
+ + +
[1] "character"
+ + + +

See it now has quotes around it? It’s now a character and will behave +as such.

+ + + +
x
+ + +
[1] "15"
+ + + +

Now that x is a character, use this chunk to try to perform calculations with +it. What happens?

+ + + +
# Try to perform calculations on `x`
+ + + +

But we can’t coerce everything:

+ + + +
# Let's create a character variable
+x <- "look at my character variable"
+ + + +

Let’s try making this a numeric variable:

+ + + +
x <- as.numeric(x)
+ + +
Warning: NAs introduced by coercion
+ + + +

Print out x.

+ + + +
x
+ + +
[1] NA
+ + + +

R is telling us it doesn’t know how to convert this to a numeric +variable, so it has returned NA instead.
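The coercion behavior described above can be summarized in a short sketch. The variable names are made up for this example, and `suppressWarnings()` is used only to keep the output quiet.

```r
# A character that looks like a number converts cleanly
numeric_text <- "15"
converted <- as.numeric(numeric_text)
converted + 1

# Arithmetic on the character itself is an error, so convert first
# numeric_text + 1   # would fail: non-numeric argument to binary operator

# Text that is not a number becomes NA (suppressWarnings() hides the warning)
not_a_number <- suppressWarnings(as.numeric("look at my character variable"))
is.na(not_a_number)

# as.integer() drops the decimal part rather than rounding
as.integer(7.9)
```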

+

For reference, here’s a summary of some of the most important +variable types.

+ ++++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| Variable Type | Definition | Examples | Coercion |
|---------------|------------|----------|----------|
| numeric | Any number value | 5; 7.5; -1 | as.numeric() |
| integer | Any whole number value (no decimals) | 5; -100 | as.integer() |
| character | Any collection of characters defined within quotation marks. Also known as a “string”. | "a" (a single letter); "stringofletters" (a whole bunch of characters put together as one); "string of letters and spaces"; "5"; 'single quotes are also good' | as.character() |
| logical | A value of TRUE, FALSE, or NA | TRUE; FALSE; NA (not defined) | as.logical() |
| factor | A special type of variable that denotes specific categories of a categorical variable | (stay tuned..) | as.factor() |
+
+
+

Vectors

+

You will have noticed that all your computations tend to pop up with +a [1] preceding them in R’s output. This is because, in +fact, all (ok mostly all) variables are by default vectors, and +our answers are the first (in these cases only) value in the vector. As +vectors get longer, new index indicators will appear at the start of new +lines.

+ + + +
# This is actually a vector that has one item in it.
+x <- 7
+ + + + + + +
# The length() function tells us how long a vector is:
+length(x)
+ + +
[1] 1
+ + + +

We can define vectors with the function c(), which +stands for “combine”. This function takes a comma-separated set of +values to place in the vector, and returns the vector itself:

+ + + +
my_numeric_vector <- c(1, 1, 2, 3, 5, 8, 13, 21)
+my_numeric_vector
+ + +
[1]  1  1  2  3  5  8 13 21
+ + + +

We can build on vectors in place by redefining them:

+ + + +
# add the next two Fibonacci numbers to the series.
+my_numeric_vector <- c(my_numeric_vector, 34, 55)
+my_numeric_vector
+ + +
 [1]  1  1  2  3  5  8 13 21 34 55
+ + + +

We can pull out specific items from a vector using a process called +indexing, which uses brackets [] to specify the +position of an item.

+ + + +
# Grab the fourth value from my_numeric_vector
+# This gives us a vector of length 1
+my_numeric_vector[4]
+ + +
[1] 3
+ + + +

Colons are also a nice way to quickly make ordered numeric vectors. +Use a colon to specify an inclusive range of indices. This will return a +vector with the items at positions 2, 3, 4, and 5.

+ + + +
my_numeric_vector[2:5]
+ + +
[1] 1 2 3 5
+ + + +
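A few more indexing patterns follow from the same idea. This is a sketch that redefines `my_numeric_vector` so it can run on its own; `first_and_last` and `without_first` are illustrative names.

```r
my_numeric_vector <- c(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)

# A colon builds an ordered integer vector on its own
2:5

# Indexing with that vector returns the items at positions 2 through 5
my_numeric_vector[2:5]

# Positions do not need to be consecutive
first_and_last <- my_numeric_vector[c(1, length(my_numeric_vector))]
first_and_last

# A negative index drops that position instead of selecting it
without_first <- my_numeric_vector[-1]
without_first
```

Note that R counts positions starting from 1, so `my_numeric_vector[1]` is the first item.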

One major benefit of vectors is the concept of +vectorization, where R by default performs operations +on the entire vector at once. For example, we can get the log +of all numbers 1-20 with a single, simple call, and more!

+ + + +
values_1_to_20 <- 1:20
+ + + + + + +
# calculate the log of values_1_to_20
+log(values_1_to_20)
+ + +
 [1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 1.7917595 1.9459101
+ [8] 2.0794415 2.1972246 2.3025851 2.3978953 2.4849066 2.5649494 2.6390573
+[15] 2.7080502 2.7725887 2.8332133 2.8903718 2.9444390 2.9957323
+ + + +

Finally, we can apply logical expressions to vectors, just as we can +do for single values. The output here is a logical vector telling us +whether the expression is TRUE or FALSE for each value in values_1_to_20.

+ + + +
# Which values are <= 3?
+values_1_to_20 <= 3
+ + +
 [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+ + + +
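A logical vector like this one is most useful when combined with indexing. The sketch below, which uses the illustrative names `is_small`, `small_values`, and `n_small`, keeps and counts the values that pass the test.

```r
values_1_to_20 <- 1:20

# Store the logical vector produced by the comparison
is_small <- values_1_to_20 <= 3

# Use it inside brackets to keep only the values where it is TRUE
small_values <- values_1_to_20[is_small]
small_values

# TRUE counts as 1 and FALSE as 0, so sum() counts the matches
n_small <- sum(is_small)
n_small
```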

There are several key functions which can be used on vectors +containing numeric values, some of which are below.

+
    +
  • mean(): The average value in the vector
  • +
  • min(): The minimum value in the vector
  • +
  • max(): The maximum value in the vector
  • +
  • sum(): The sum of all values in the vector
  • +
+

We can try out these functions on the vector +values_1_to_20 we’ve created.

+ + + +
mean(values_1_to_20)
+ + +
[1] 10.5
+ + +
# Try out some of the other functions we've listed above 
+ + + +
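For comparison, here is one way the remaining functions from the list could be applied to the same vector. This is only a sketch; the names `smallest`, `largest`, and `total` are made up for the example.

```r
values_1_to_20 <- 1:20

smallest <- min(values_1_to_20)
largest <- max(values_1_to_20)
total <- sum(values_1_to_20)

smallest
largest
total

# The mean is the sum divided by the number of values
total / length(values_1_to_20)
```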
+
+

A note on variable naming

+

We have learned functions such as c, +length, sum, and so on. Imagine defining a +variable called c: This will work, but it can lead to a +lot of unintended bugs, so it’s best to avoid this.

+
+
+

The %in% logical operator

+

%in% is useful for determining whether one or more items +are in a vector.

+ + + +
# is `7` in our vector? 
+7 %in% values_1_to_20
+ + +
[1] TRUE
+ + + + + + +
# is `50` in our vector? 
+50 %in% values_1_to_20
+ + +
[1] FALSE
+ + + +

We can test a vector of values being within another vector of +values.

+ + + +
question_values <- c(1:3, 7, 50)
+# Are these values in our vector?
+question_values %in% values_1_to_20
+ + +
[1]  TRUE  TRUE  TRUE  TRUE FALSE
+ + + +
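Because `%in%` returns a logical vector, it can also be used inside brackets to pull out the matching values. A minimal sketch, with `found_values` and `missing_values` as illustrative names:

```r
values_1_to_20 <- 1:20
question_values <- c(1:3, 7, 50)

# Keep only the question values that appear in values_1_to_20
found_values <- question_values[question_values %in% values_1_to_20]
found_values

# Negate with ! to find the values that are missing
missing_values <- question_values[!(question_values %in% values_1_to_20)]
missing_values
```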
+
+
+

Data frames

+

Data frames are one of the most useful tools for data analysis in +R. They are tables which consist of rows and columns, much like a +spreadsheet. Each column is a variable which behaves as a +vector, and each row is an observation. We will begin our +exploration with a dataset of measurements from three penguin species, +which we can find in the palmerpenguins +package. We’ll talk more about packages soon! To use this dataset, +we will load it from the palmerpenguins package using +:: (more on this later) and assign it to a variable named +penguins in our current environment.

+ + + +
penguins <- palmerpenguins::penguins
+ + + +

drawings of penguin species Artwork by @allison_horst

+
+

Exploring data frames

+

The first step to using any data is to look at it!!! RStudio contains +a special function View() which allows you to literally +view a variable. You can also click on the object in the environment +pane to see its overall properties, or click the table icon on the +object’s row to automatically view the variable.

+

Some useful functions for exploring our data frame include:

+
    +
  • head() to see the first 6 rows of a data frame. +Additional arguments supplied can change the number of rows.
  • +
  • tail() to see the last 6 rows of a data frame. +Additional arguments supplied can change the number of rows.
  • +
  • names() to see the column names of the data frame.
  • +
  • nrow() to see how many rows are in the data frame
  • +
  • ncol() to see how many columns are in the data +frame.
  • +
+
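The functions listed above can be tried on any data frame. The sketch below uses a tiny, made-up data frame called `toy_df` (hypothetical values, not taken from the penguins data) so that it runs without any extra packages.

```r
# A hypothetical toy data frame, made up for illustration
toy_df <- data.frame(
  species = c("Adelie", "Adelie", "Gentoo", "Chinstrap"),
  bill_length_mm = c(39.1, 39.5, 46.1, 50.0),
  year = c(2007, 2007, 2008, 2009)
)

head(toy_df, n = 2)   # first 2 rows instead of the default 6
tail(toy_df, n = 1)   # last row only
names(toy_df)         # column names
nrow(toy_df)          # number of rows
ncol(toy_df)          # number of columns
```

The same calls work unchanged on `penguins`, for example `head(penguins, n = 10)`.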

We can additionally explore overall properties of the data +frame with two different functions: summary() and +str().

+

summary() provides summary statistics for each column:

+ + + +
summary(penguins)
+ + +
      species          island    bill_length_mm  bill_depth_mm  
+ Adelie   :152   Biscoe   :168   Min.   :32.10   Min.   :13.10  
+ Chinstrap: 68   Dream    :124   1st Qu.:39.23   1st Qu.:15.60  
+ Gentoo   :124   Torgersen: 52   Median :44.45   Median :17.30  
+                                 Mean   :43.92   Mean   :17.15  
+                                 3rd Qu.:48.50   3rd Qu.:18.70  
+                                 Max.   :59.60   Max.   :21.50  
+                                 NA's   :2       NA's   :2      
+ flipper_length_mm  body_mass_g       sex           year     
+ Min.   :172.0     Min.   :2700   female:165   Min.   :2007  
+ 1st Qu.:190.0     1st Qu.:3550   male  :168   1st Qu.:2007  
+ Median :197.0     Median :4050   NA's  : 11   Median :2008  
+ Mean   :200.9     Mean   :4202                Mean   :2008  
+ 3rd Qu.:213.0     3rd Qu.:4750                3rd Qu.:2009  
+ Max.   :231.0     Max.   :6300                Max.   :2009  
+ NA's   :2         NA's   :2                                 
+ + + +

str() provides a short view of the structure and +contents of the data frame.

+ + + +
str(penguins)
+ + +
tibble [344 × 8] (S3: tbl_df/tbl/data.frame)
+ $ species          : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
+ $ island           : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
+ $ bill_length_mm   : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
+ $ bill_depth_mm    : num [1:344] 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
+ $ flipper_length_mm: int [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
+ $ body_mass_g      : int [1:344] 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ...
+ $ sex              : Factor w/ 2 levels "female","male": 2 1 1 NA 1 2 1 2 NA NA ...
+ $ year             : int [1:344] 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...
+ + + +

You’ll notice that the column species is a +factor: This is a special type of character variable that +represents distinct categories known as “levels”. We have learned here +that there are three levels in the species column: Adelie, +Chinstrap, and Gentoo. We might want to explore individual columns of +the data frame more in-depth. We can examine individual columns using +the dollar sign $ to select one by name:

+ + + +
# Extract bill_length_mm as a vector
+penguins$bill_length_mm
+ + +
  [1] 39.1 39.5 40.3   NA 36.7 39.3 38.9 39.2 34.1 42.0 37.8 37.8 41.1 38.6 34.6
+ [16] 36.6 38.7 42.5 34.4 46.0 37.8 37.7 35.9 38.2 38.8 35.3 40.6 40.5 37.9 40.5
+ [31] 39.5 37.2 39.5 40.9 36.4 39.2 38.8 42.2 37.6 39.8 36.5 40.8 36.0 44.1 37.0
+ [46] 39.6 41.1 37.5 36.0 42.3 39.6 40.1 35.0 42.0 34.5 41.4 39.0 40.6 36.5 37.6
+ [61] 35.7 41.3 37.6 41.1 36.4 41.6 35.5 41.1 35.9 41.8 33.5 39.7 39.6 45.8 35.5
+ [76] 42.8 40.9 37.2 36.2 42.1 34.6 42.9 36.7 35.1 37.3 41.3 36.3 36.9 38.3 38.9
+ [91] 35.7 41.1 34.0 39.6 36.2 40.8 38.1 40.3 33.1 43.2
+ [ reached getOption("max.print") -- omitted 244 entries ]
+ + +
# indexing operators can be used on these vectors too
+penguins$bill_length_mm[1:10]
+ + +
 [1] 39.1 39.5 40.3   NA 36.7 39.3 38.9 39.2 34.1 42.0
+ + + +

We can perform our regular vector operations on columns directly.

+ + + +
# calculate the mean of the bill_length_mm column
+mean(penguins$bill_length_mm,
+     na.rm = TRUE) # remove missing values before calculating the mean
+ + +
[1] 43.92193
+ + + +
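To see why `na.rm = TRUE` matters, consider this sketch, which uses only the first five bill lengths printed above (one of which is missing). The names `bill_lengths` and `mean_without_na` are illustrative.

```r
# The first five bill lengths shown above, including one missing value
bill_lengths <- c(39.1, 39.5, 40.3, NA, 36.7)

# Without na.rm, a single NA makes the whole result NA
mean(bill_lengths)

# With na.rm = TRUE, the NA is dropped and the mean uses the remaining 4 values
mean_without_na <- mean(bill_lengths, na.rm = TRUE)
mean_without_na

# is.na() flags the missing entries, so sum() counts them
sum(is.na(bill_lengths))
```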

We can also calculate the full summary statistics for a single column +directly.

+ + + +
# show a summary of the bill_length_mm column
+summary(penguins$bill_length_mm)
+ + +
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
+  32.10   39.23   44.45   43.92   48.50   59.60       2 
+ + + +

Extract species as a vector and subset it to see a +preview.

+ + + +
# get the first 10 values of the species column
+penguins$species[1:10]
+ + +
 [1] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
+Levels: Adelie Chinstrap Gentoo
+ + + +

And view its levels with the levels() +function.

+ + + +
levels(penguins$species)
+ + +
[1] "Adelie"    "Chinstrap" "Gentoo"   
+ + + +
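Factors can also be built by hand, which helps show what levels are. The following sketch assumes a short, made-up vector of species names and is not taken from the penguins data itself.

```r
# A small character vector of species names, for illustration
species_names <- c("Gentoo", "Adelie", "Adelie", "Chinstrap")
species_factor <- factor(species_names)

# Levels are the distinct categories, sorted alphabetically by default
levels(species_factor)

# Each value is stored as an integer code pointing at a level
as.integer(species_factor)

# table() counts how many values fall in each level
table(species_factor)
```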
+
+
+

Files and directories

+

In many situations, we will be reading in tabular data from a file +and using it as a data frame. To practice, we will read in a file we +will be using in the next notebook as well, +gene_results_GSE44971.tsv, in the data folder. +File paths are relative to the location where this notebook file (.Rmd) +is saved.

+

Here we will use a function, read_tsv() from the +readr package. Before we are able to use the function, we +have to load the package using library().

+ + + +
library(readr)
+ + + +

file.path() creates a properly formatted file path by +adding a path separator (/ on Mac and Linux operating +systems, the latter of which is the operating system that our RStudio +Server runs on) between separate folders or directories. Because file +path separators can differ between your computer and the computer of +someone who wants to use your code, we use file.path() +instead of typing out "data/gene_results_GSE44971.tsv". +Each argument to file.path() is a directory or +file name. You’ll notice each argument is in quotes. We specify +data first because the file, +gene_results_GSE44971.tsv, is in the data +folder.

+ + + +
file.path("data", "gene_results_GSE44971.tsv")
+ + +
[1] "data/gene_results_GSE44971.tsv"
+ + + +

As you can see above, the result of running file.path() +is that it creates a string with an accurately-formatted path +for your file system. This string can be used moving forward when you +need to refer to the path to your file. Let’s go ahead and store this +file path as a variable in our environment.

+ + + +
gene_file_path <- file.path("data", "gene_results_GSE44971.tsv")
+ + + +
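`file.path()` accepts any number of arguments, so nested folders work the same way. In the sketch below, the folder and file names are hypothetical and exist only for illustration.

```r
# Hypothetical folder and file names, used only for illustration
nested_path <- file.path("data", "results", "example_file.tsv")
nested_path

# basename() and dirname() split a path back into its parts
basename(nested_path)
dirname(nested_path)

# file.exists() reports whether a path points at a real file;
# it is worth checking before trying to read the file
file.exists(nested_path)
```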

Now we are ready to use read_tsv() to read the file into +R. The resulting data frame will be stored in a variable named +stats_df. Note the <- (assignment +operator!) is responsible for saving this to our global environment.

+ + + +
# read in the file `gene_results_GSE44971.tsv` from the data directory
+stats_df <- read_tsv(gene_file_path)
+ + +
Rows: 6804 Columns: 8
+── Column specification ────────────────────────────────────────────────────────
+Delimiter: "\t"
+chr (3): ensembl_id, gene_symbol, contrast
+dbl (5): log_fold_change, avg_expression, t_statistic, p_value, adj_p_value
+
+ℹ Use `spec()` to retrieve the full column specification for this data.
+ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
+ + + +
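To practice the write-then-read round trip without the course data, the sketch below creates a small made-up table in a temporary file and reads it back. Note that it swaps in base R's `read.delim()` rather than `read_tsv()` so that it runs without the `readr` package; the table contents and file name are hypothetical.

```r
# A hypothetical two-row table, written to a temporary file
example_df <- data.frame(
  gene_symbol = c("geneA", "geneB"),
  p_value = c(0.01, 0.20)
)
example_path <- file.path(tempdir(), "example_results.tsv")
write.table(example_df, example_path, sep = "\t", row.names = FALSE, quote = FALSE)

# read.delim() is the base R reader for tab-separated files
example_read <- read.delim(example_path)

# The result is a data frame with the same shape as the original
str(example_read)
nrow(example_read)
```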

Take a look at your Environment pane to see what +stats_df looks like. We can also print out a preview of the +stats_df data frame here.

+ + + +
# display stats_df
+stats_df
+ +
+ +
+ + +
+

Session Info

+

At the end of every notebook, you will see us print out the result of +sessionInfo(). This aids in the reproducibility of your code +by showing exactly what packages and versions were being used the last +time the notebook was run.

+ + + +
sessionInfo()
+ + +
R version 4.4.0 (2024-04-24)
+Platform: x86_64-pc-linux-gnu
+Running under: Ubuntu 22.04.4 LTS
+
+Matrix products: default
+BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+
+locale:
+ [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+ [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+ [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+ [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+ [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+
+time zone: Etc/UTC
+tzcode source: system (glibc)
+
+attached base packages:
+[1] stats     graphics  grDevices utils     datasets  methods   base     
+
+other attached packages:
+[1] readr_2.1.5    optparse_1.7.5
+
+loaded via a namespace (and not attached):
+ [1] crayon_1.5.2         vctrs_0.6.5          cli_3.6.2           
+ [4] knitr_1.46           rlang_1.1.3          xfun_0.43           
+ [7] stringi_1.8.3        jsonlite_1.8.8       bit_4.0.5           
+[10] glue_1.7.0           htmltools_0.5.8.1    sass_0.4.9          
+[13] hms_1.1.3            fansi_1.0.6          rmarkdown_2.26      
+[16] evaluate_0.23        jquerylib_0.1.4      tibble_3.2.1        
+[19] tzdb_0.4.0           fastmap_1.1.1        yaml_2.3.8          
+[22] lifecycle_1.0.4      palmerpenguins_0.1.1 stringr_1.5.1       
+[25] compiler_4.4.0       getopt_1.20.4        pkgconfig_2.0.3     
+[28] digest_0.6.35        R6_2.5.1             tidyselect_1.2.1    
+[31] utf8_1.2.4           parallel_4.4.0       vroom_1.6.5         
+[34] pillar_1.9.0         magrittr_2.0.3       bslib_0.7.0         
+[37] bit64_4.0.5          tools_4.4.0          cachem_1.0.8        
+ + +
+
+ +
aW9ucyBoYXZlIHRoZSBmb2xsb3dpbmcgZm9ybWF0LCB3aGVyZSB0aGUgX2FyZ3VtZW50XyBpcyB0aGUgaW5mb3JtYXRpb24gd2UgYXJlIHByb3ZpZGluZyB0byB0aGUgZnVuY3Rpb24gZm9yIGl0IHRvIHJ1bi4gCkFuIGV4YW1wbGUgb2YgdGhpcyB3YXMgdGhlIGBhdGFuKClgIGZ1bmN0aW9uIHVzZWQgYWJvdmUuCgpgYGByCmZ1bmN0aW9uX25hbWUoYXJndW1lbnQpCmBgYAoKVG8gbGVhcm4gYWJvdXQgZnVuY3Rpb25zLCB3ZSdsbCBleGFtaW5lIG9uZSBjYWxsZWQgYGxvZygpYCBmaXJzdC4gCgpUbyBrbm93IHdoYXQgYSBmdW5jdGlvbiBkb2VzIGFuZCBob3cgdG8gdXNlIGl0LCB1c2UgdGhlIHF1ZXN0aW9uIG1hcmsgd2hpY2ggd2lsbCByZXZlYWwgZG9jdW1lbnRhdGlvbiBpbiB0aGUgKipoZWxwIHBhbmUqKjogYD9sb2dgCiFbcmhlbHBdKHNjcmVlbnNob3RzL3JoZWxwLWxvZy5wbmcpIAoKVGhlIGRvY3VtZW50YXRpb24gdGVsbHMgdXMgdGhhdCBgbG9nKClgIGlzIGRlcml2ZWQgZnJvbSBge2Jhc2V9YCwgbWVhbmluZyBpdCBpcyBhIGZ1bmN0aW9uIHRoYXQgaXMgcGFydCBvZiBiYXNlIFIuIApJdCBwcm92aWRlcyBhIGJyaWVmIGRlc2NyaXB0aW9uIG9mIHdoYXQgdGhlIGZ1bmN0aW9uIGRvZXMgYW5kIHNob3dzIHNldmVyYWwgZXhhbXBsZXMgb2YgdG8gaG93IHVzZSBpdC4KCkluIHBhcnRpY3VsYXIsIHRoZSBkb2N1bWVudGF0aW9uIHRlbGxzIHVzIGFib3V0IHdoYXQgYXJndW1lbnQocykgdG8gcHJvdmlkZToKCisgVGhlIGZpcnN0IF9yZXF1aXJlZF8gYXJndW1lbnQgaXMgdGhlIHZhbHVlIHdlJ2QgbGlrZSB0byB0YWtlIHRoZSBsb2cgb2YsIGJ5IGRlZmF1bHQgaXRzIF9uYXR1cmFsIGxvZ18KKyBUaGUgc2Vjb25kIF9vcHRpb25hbF8gYXJndW1lbnQgY2FuIHNwZWNpZnkgYSBkaWZmZXJlbnQgYmFzZSByYXRoZXIgdGhhbiB0aGUgZGVmYXVsdCBgZWAuCgpGdW5jdGlvbnMgYWxzbyBfcmV0dXJuXyB2YWx1ZXMgZm9yIHVzIHRvIHVzZS4gCkluIHRoZSBjYXNlIG9mIGBsb2coKWAsIHRoZSByZXR1cm5lZCB2YWx1ZSBpcyB0aGUgbG9nJ2QgdmFsdWUgdGhlIGZ1bmN0aW9uIGNvbXB1dGVkLgoKYGBge3IgbG9nfQpsb2coNzMpCmBgYAoKSGVyZSB3ZSBjYW4gc3BlY2lmeSBhbiBfYXJndW1lbnRfIG9mIGBiYXNlYCB0byBjYWxjdWxhdGUgbG9nIGJhc2UgMy4gCgpgYGB7ciBsb2czfQpsb2coODEsIGJhc2UgPSAzKQpgYGAKCklmIHdlIGRvbid0IHNwZWNpZnkgdGhlIF9hcmd1bWVudF8gbmFtZXMsIGl0IGFzc3VtZXMgdGhleSBhcmUgaW4gdGhlIG9yZGVyIHRoYXQgYGxvZ2AgZGVmaW5lcyB0aGVtLiAKU2VlIGA/bG9nYCB0byBzZWUgbW9yZSBhYm91dCBpdHMgYXJndW1lbnRzLiAKCmBgYHtyIGxvZzIsIGxpdmUgPSBUUlVFfQpsb2coOCwgMikKYGBgCgpXZSBjYW4gc3dpdGNoIHRoZSBvcmRlciBpZiB3ZSBzcGVjaWZ5IHRoZSBhcmd1bWVudCBuYW1lcy4gCgpgYGB7ciBsb2ctb3JkZXJ9CmxvZyhiYXNlID0gMTAsIHggPSA0MzQyKQpgYGAK
CldlIGNhbiBhbHNvIHByb3ZpZGUgdmFyaWFibGVzIGFzIGFyZ3VtZW50cyBpbiB0aGUgc2FtZSB3YXkgYXMgdGhlIHJhdyB2YWx1ZXMuIAoKYGBge3IgbG9nLXZhcmlhYmxlfQptZWFuaW5nIDwtIDQyCmxvZyhtZWFuaW5nKQpgYGAKCiMjIFdvcmtpbmcgd2l0aCB2YXJpYWJsZXMKCiMjIyBWYXJpYWJsZSBUeXBlcwoKVmFyaWFibGUgdHlwZXMgaW4gUiBjYW4gc29tZXRpbWVzIGJlIF9jb2VyY2VkXyAoY29udmVydGVkKSBmcm9tIG9uZSB0eXBlIHRvIGFub3RoZXIuCgpgYGB7cn0KIyBEZWZpbmUgYSB2YXJpYWJsZSB3aXRoIGEgbnVtYmVyCnggPC0gMTUKYGBgCgpUaGUgZnVuY3Rpb24gYGNsYXNzKClgIHdpbGwgdGVsbCB1cyB0aGUgdmFyaWFibGUncyB0eXBlLgoKYGBge3J9CmNsYXNzKHgpCmBgYAoKTGV0J3MgY29lcmNlIGl0IHRvIGEgY2hhcmFjdGVyLiAKCmBgYHtyfQp4IDwtIGFzLmNoYXJhY3Rlcih4KQpjbGFzcyh4KQpgYGAKClNlZSBpdCBub3cgaGFzIHF1b3RlcyBhcm91bmQgaXQ/IEl0J3Mgbm93IGEgY2hhcmFjdGVyIGFuZCB3aWxsIGJlaGF2ZSBhcyBzdWNoLgoKYGBge3J9CngKYGBgCgpVc2UgdGhpcyBjaHVuayB0byB0cnkgdG8gcGVyZm9ybSBjYWxjdWxhdGlvbnMgd2l0aCBgeGAsIG5vdyB0aGF0IGl0IGlzIGEgY2hhcmFjdGVyLCB3aGF0IGhhcHBlbnM/IAoKYGBge3IgbGl2ZSA9IFRSVUV9CiMgVHJ5IHRvIHBlcmZvcm0gY2FsY3VsYXRpb25zIG9uIGB4YApgYGAKCkJ1dCB3ZSBjYW4ndCBjb2VyY2UgZXZlcnl0aGluZzoKCmBgYHtyfQojIExldCdzIGNyZWF0ZSBhIGNoYXJhY3RlciB2YXJpYWJsZQp4IDwtICJsb29rIGF0IG15IGNoYXJhY3RlciB2YXJpYWJsZSIKYGBgCgpMZXQncyB0cnkgbWFraW5nIHRoaXMgYSBudW1lcmljIHZhcmlhYmxlOgoKYGBge3IgY29lcmNlLWNoYXIsIGVycm9yPVRSVUV9CnggPC0gYXMubnVtZXJpYyh4KQpgYGAKClByaW50IG91dCBgeGAuCgpgYGB7cn0KeApgYGAKClIgaXMgdGVsbGluZyB1cyBpdCBkb2Vzbid0IGtub3cgaG93IHRvIGNvbnZlcnQgdGhpcyB0byBhIG51bWVyaWMgdmFyaWFibGUsIHNvIGl0IGhhcyByZXR1cm5lZCBgTkFgIGluc3RlYWQuCgpGb3IgcmVmZXJlbmNlLCBoZXJlJ3MgYSBzdW1tYXJ5IG9mIHNvbWUgb2YgdGhlIG1vc3QgaW1wb3J0YW50IHZhcmlhYmxlIHR5cGVzLiAKCnwgVmFyaWFibGUgVHlwZSB8IERlZmluaXRpb24gfCBFeGFtcGxlcyB8IENvZXJjaW9uIHwKfC0tLS0tLS0tLS0tLS0tLXwtLS0tLS0tLS0tLS18LS0tLS0tLS0tLXwgLS0tLS0tLS18CnwgYG51bWVyaWNgICAgICAgIHwgQW55IG51bWJlciB2YWx1ZSB8IGA1YDxicj5gNy41YCA8YnI+YC0xYHwgYGFzLm51bWVyaWMoKWAKfCBgaW50ZWdlcmAgICAgICAgfCBBbnkgX3dob2xlXyBudW1iZXIgdmFsdWUgKG5vIGRlY2ltYWxzKSB8IGA1YCA8YnI+IGAtMTAwYCB8IGBhcy5pbnRlZ2VyKClgCnxgY2hhcmFjdGVyYCAgICAgIHwgQW55IGNvbGxlY3Rpb24gb2YgY2hhcmFjdGVycyBkZWZpbmVkIHdp
dGhpbiBfcXVvdGF0aW9uIG1hcmtzXy4gQWxzbyBrbm93biBhcyBhICJzdHJpbmciLiB8IGAiYSJgIChhIHNpbmdsZSBsZXR0ZXIpIDxicj5gInN0cmluZ29mbGV0dGVycyJgIChhIHdob2xlIGJ1bmNoIG9mIGNoYXJhY3RlcnMgcHV0IHRvZ2V0aGVyIGFzIG9uZSkgPGJyPiBgInN0cmluZyBvZiBsZXR0ZXJzIGFuZCBzcGFjZXMiYCA8YnI+IGAiNSJgIDxicj4gYCdzaW5nbGUgcXVvdGVzIGFyZSBhbHNvIGdvb2QnYCB8IGBhcy5jaGFyYWN0ZXIoKWAKfGBsb2dpY2FsYCAgICAgIHwgQSB2YWx1ZSBvZiBgVFJVRWAsIGBGQUxTRWAsIG9yIGBOQWAgfCBgVFJVRWAgPGJyPiBgRkFMU0VgIDxicj4gYE5BYCAobm90IGRlZmluZWQpIHwgYGFzLmxvZ2ljYWwoKWAgCnxgZmFjdG9yYCAgICAgICB8IEEgc3BlY2lhbCB0eXBlIG9mIHZhcmlhYmxlIHRoYXQgZGVub3RlcyBzcGVjaWZpYyBjYXRlZ29yaWVzIG9mIGEgY2F0ZWdvcmljYWwgdmFyaWFibGUgfCAoc3RheSB0dW5lZC4uKSB8IGBhcy5mYWN0b3IoKWAKCiMjIyBWZWN0b3JzCgpZb3Ugd2lsbCBoYXZlIG5vdGljZWQgdGhhdCBhbGwgeW91ciBjb21wdXRhdGlvbnMgdGVuZCB0byBwb3AgdXAgd2l0aCBhIGBbMV1gIHByZWNlZGluZyB0aGVtIGluIFIncyBvdXRwdXQuIApUaGlzIGlzIGJlY2F1c2UsIGluIGZhY3QsIGFsbCAob2sgbW9zdGx5IGFsbCkgdmFyaWFibGVzIGFyZSBfYnkgZGVmYXVsdF8gIHZlY3RvcnMsIGFuZCBvdXIgYW5zd2VycyBhcmUgdGhlIGZpcnN0IChpbiB0aGVzZSBjYXNlcyBvbmx5KSB2YWx1ZSBpbiB0aGUgdmVjdG9yLiAKQXMgdmVjdG9ycyBnZXQgbG9uZ2VyLCBuZXcgaW5kZXggaW5kaWNhdG9ycyB3aWxsIGFwcGVhciBhdCB0aGUgc3RhcnQgb2YgbmV3IGxpbmVzLiAKCmBgYHtyfQojIFRoaXMgaXMgYWN0dWFsbHkgYW4gdmVjdG9yIHRoYXQgaGFzIG9uZSBpdGVtIGluIGl0Lgp4IDwtIDcKYGBgCgpgYGB7ciB2ZWN0b3ItbGVuZ3RofQojIFRoZSBsZW5ndGgoKSBmdW5jdGlvbnMgdGVsbHMgdXMgaG93IGxvbmcgYW4gdmVjdG9yIGlzOgpsZW5ndGgoeCkKYGBgCgpXZSBjYW4gZGVmaW5lIHZlY3RvcnMgd2l0aCB0aGUgZnVuY3Rpb24gYGMoKWAsIHdoaWNoIHN0YW5kcyBmb3IgImNvbWJpbmUiLiAKVGhpcyBmdW5jdGlvbiB0YWtlcyBhIGNvbW1hLXNlcGFyYXRlZCBzZXQgb2YgdmFsdWVzIHRvIHBsYWNlIGluIHRoZSB2ZWN0b3IsIGFuZCByZXR1cm5zIHRoZSB2ZWN0b3IgaXRzZWxmOgoKYGBge3IgbWFrZS12ZWN0b3J9Cm15X251bWVyaWNfdmVjdG9yIDwtIGMoMSwgMSwgMiwgMywgNSwgOCwgMTMsIDIxKQpteV9udW1lcmljX3ZlY3RvcgpgYGAKCldlIGNhbiBidWlsZCBvbiB2ZWN0b3JzIGluIHBsYWNlIGJ5IHJlZGVmaW5pbmcgdGhlbToKCmBgYHtyIGZpYmJvbmFjY2ksIGxpdmUgPSBUUlVFfQojIGFkZCB0aGUgbmV4dCB0d28gRmlib25hY2NpIG51bWJlcnMgdG8gdGhlIHNlcmllcy4KbXlfbnVtZXJpY192ZWN0b3IgPC0gYyhteV9udW1lcmljX3ZlY3RvciwgMzQsIDU1
KQpteV9udW1lcmljX3ZlY3RvcgpgYGAKCldlIGNhbiBwdWxsIG91dCBzcGVjaWZpYyBpdGVtcyBmcm9tIGFuIHZlY3RvciB1c2luZyBhIHByb2Nlc3MgY2FsbGVkIF9pbmRleGluZ18sIHdoaWNoIHVzZXMgYnJhY2tldHMgYFtdYCB0byBzcGVjaWZ5IHRoZSBwb3NpdGlvbiBvZiBhbiBpdGVtLiAKCmBgYHtyIHN1YnNldDF9CiMgR3JhYiB0aGUgZm91cnRoIHZhbHVlIGZyb20gbXlfbnVtZXJpY192ZWN0b3IKIyBUaGlzIGdpdmVzIHVzIGFuIHZlY3RvciBvZiBsZW5ndGggMSAKbXlfbnVtZXJpY192ZWN0b3JbNF0KYGBgCgpDb2xvbnMgYXJlIGFsc28gYSBuaWNlIHdheSB0byBxdWlja2x5IG1ha2Ugb3JkZXJlZCBudW1lcmljIHZlY3RvcnMKVXNlIGEgY29sb24gdG8gc3BlY2lmeSBhbiBpbmNsdXNpdmUgcmFuZ2Ugb2YgaW5kaWNlcwpUaGlzIHdpbGwgcmV0dXJuIGFuIHZlY3RvciB3aXRoIDIsIDMsIDQsIGFuZCA1LgoKYGBge3Igc3Vic2V0LW1hbnl9Cm15X251bWVyaWNfdmVjdG9yWzI6NV0KYGBgCgpPbmUgbWFqb3IgYmVuZWZpdCBvZiB2ZWN0b3JzIGlzIHRoZSBjb25jZXB0IG9mICoqdmVjdG9yaXphdGlvbioqLCB3aGVyZSBSIGJ5IGRlZmF1bHQgcGVyZm9ybXMgb3BlcmF0aW9ucyBvbiB0aGUgX2VudGlyZSB2ZWN0b3IgYXQgb25jZV8uIApGb3IgZXhhbXBsZSwgd2UgY2FuIGdldCB0aGUgbG9nIG9mIGFsbCBudW1iZXJzIDEtMjAgd2l0aCBhIHNpbmdsZSwgc2ltcGxlIGNhbGwsIGFuZCBtb3JlIQoKYGBge3IgdmVjdG9yaXplfQp2YWx1ZXNfMV90b18yMCA8LSAxOjIwCmBgYAoKCmBgYHtyIHZlY3Rvcml6ZS1sb2csIGxpdmUgPSBUUlVFfQojIGNhbGN1bGF0ZSB0aGUgbG9nIG9mIHZhbHVlc18xX3RvXzIwCmxvZyh2YWx1ZXNfMV90b18yMCkKYGBgCgpGaW5hbGx5LCB3ZSBjYW4gYXBwbHkgbG9naWNhbCBleHByZXNzaW9ucyB0byB2ZWN0b3JzLCBqdXN0IGFzIHdlIGNhbiBkbyBmb3Igc2luZ2xlIHZhbHVlcy4KVGhlIG91dHB1dCBoZXJlIGlzIGEgbG9naWNhbCB2ZWN0b3IgdGVsbGluZyB1cyB3aGV0aGVyIGVhY2ggdmFsdWUgaW4gZXhhbXBsZV92ZWN0b3IgaXMgVFJVRSBvciBGQUxTRQoKYGBge3IgdmVjdG9yLWNvbXBhcmV9CiMgV2hpY2ggdmFsdWVzIGFyZSA8PSAzPwp2YWx1ZXNfMV90b18yMCA8PSAzCmBgYAoKVGhlcmUgYXJlIHNldmVyYWwga2V5IGZ1bmN0aW9ucyB3aGljaCBjYW4gYmUgdXNlZCBvbiB2ZWN0b3JzIGNvbnRhaW5pbmcgbnVtZXJpYyB2YWx1ZXMsIHNvbWUgb2Ygd2hpY2ggYXJlIGJlbG93LgoKKyBgbWVhbigpYDogVGhlIGF2ZXJhZ2UgdmFsdWUgaW4gdGhlIHZlY3RvcgorIGBtaW4oKWA6IFRoZSBtaW5pbXVtIHZhbHVlIGluIHRoZSB2ZWN0b3IKKyBgbWF4KClgOiBUaGUgbWF4aW11bSB2YWx1ZSBpbiB0aGUgdmVjdG9yCisgYHN1bSgpYDogVGhlIHN1bSBvZiBhbGwgdmFsdWVzIGluIHRoZSB2ZWN0b3IKCldlIGNhbiB0cnkgb3V0IHRoZXNlIGZ1bmN0aW9ucyBvbiB0aGUgdmVjdG9yIGB2YWx1ZXNfMV90b18y
MGAgd2UndmUgY3JlYXRlZC4gCgpgYGB7ciB2ZWN0b3ItZnVuY3N9Cm1lYW4odmFsdWVzXzFfdG9fMjApCgojIFRyeSBvdXQgc29tZSBvZiB0aGUgb3RoZXIgZnVuY3Rpb25zIHdlJ3ZlIGxpc3RlZCBhYm92ZSAKCmBgYAoKIyMjIEEgbm90ZSBvbiB2YXJpYWJsZSBuYW1pbmcKCldlIGhhdmUgbGVhcm5lZCBmdW5jdGlvbnMgc3VjaCBhcyBgY2AsIGBsZW5ndGhgLCBgc3VtYCwgYW5kIGV0Yy4gCkltYWdpbmUgZGVmaW5pbmcgYSB2YXJpYWJsZSBjYWxsZWQgYGNgOiBUaGlzIHdpbGwgd29yaywgYnV0IGl0IHdpbGwgbGVhZCB0byBhIApsb3Qgb2YgdW5pbnRlbmRlZCBidWdzLCBzbyBpdCdzIGJlc3QgdG8gYXZvaWQgdGhpcy4gCgojIyMgVGhlIGAlaW4lYCBsb2dpY2FsIG9wZXJhdG9yIAoKYCVpbiVgIGlzIHVzZWZ1bCBmb3IgZGV0ZXJtaW5pbmcgd2hldGhlciBhIGdpdmVuIGl0ZW0ocykgYXJlIGluIGFuIHZlY3Rvci4KCmBgYHtyIGluLW9wZXJhdG9yfQojIGlzIGA3YCBpbiBvdXIgdmVjdG9yPyAKNyAlaW4lIHZhbHVlc18xX3RvXzIwCmBgYAoKYGBge3IgaW4yLCBsaXZlID0gVFJVRX0KIyBpcyBgNTBgIGluIG91ciB2ZWN0b3I/IAo1MCAlaW4lIHZhbHVlc18xX3RvXzIwCmBgYAoKV2UgY2FuIHRlc3QgYSB2ZWN0b3Igb2YgdmFsdWVzIGJlaW5nIHdpdGhpbiBhbm90aGVyIHZlY3RvciBvZiB2YWx1ZXMuIAoKYGBge3IgdmVjdG9yLWluLCBsaXZlID0gVFJVRX0KcXVlc3Rpb25fdmFsdWVzIDwtIGMoMTozLCA3LCA1MCkKIyBBcmUgdGhlc2UgdmFsdWVzIGluIG91ciB2ZWN0b3I/CnF1ZXN0aW9uX3ZhbHVlcyAlaW4lIHZhbHVlc18xX3RvXzIwCmBgYAoKIyMgRGF0YSBmcmFtZXMKCl9EYXRhIGZyYW1lcyBhcmUgb25lIG9mIHRoZSBtb3N0IHVzZWZ1bCB0b29scyBmb3IgZGF0YSBhbmFseXNpcyBpbiBSLl8gClRoZXkgYXJlIHRhYmxlcyB3aGljaCBjb25zaXN0IG9mIHJvd3MgYW5kIGNvbHVtbnMsIG11Y2ggbGlrZSBhIF9zcHJlYWRzaGVldF8uIApFYWNoIGNvbHVtbiBpcyBhIHZhcmlhYmxlIHdoaWNoIGJlaGF2ZXMgYXMgYSBfdmVjdG9yXywgYW5kIGVhY2ggcm93IGlzIGFuIG9ic2VydmF0aW9uLiAKV2Ugd2lsbCBiZWdpbiBvdXIgZXhwbG9yYXRpb24gd2l0aCBkYXRhc2V0IG9mIG1lYXN1cmVtZW50cyBmcm9tIHRocmVlIHBlbmd1aW4gc3BlY2llcyBtZWFzdXJlZCwgd2hpY2ggd2UgY2FuIGZpbmQgaW4gdGhlIFtgcGFsbWVycGVuZ3VpbnNgIHBhY2thZ2VdKGh0dHBzOi8vYWxsaXNvbmhvcnN0LmdpdGh1Yi5pby9wYWxtZXJwZW5ndWlucy8pLiAKV2UnbGwgdGFsayBtb3JlIGFib3V0IHBhY2thZ2VzIHNvb24hClRvIHVzZSB0aGlzIGRhdGFzZXQsIHdlIHdpbGwgbG9hZCBpdCBmcm9tIHRoZSBgcGFsbWVycGVuZ3VpbnNgIHBhY2thZ2UgdXNpbmcgYSBgOjpgIChtb3JlIG9uIHRoaXMgbGF0ZXIpIGFuZCBhc3NpZ24gaXQgdG8gYSB2YXJpYWJsZSBuYW1lZCBgcGVuZ3VpbnNgIGluIG91ciBjdXJyZW50IGVudmlyb25tZW50LgoKYGBge3Ig
cGVuZ3Vpbi1saWJyYXJ5fQpwZW5ndWlucyA8LSBwYWxtZXJwZW5ndWluczo6cGVuZ3VpbnMKYGBgCgohW2RyYXdpbmdzIG9mIHBlbmd1aW4gc3BlY2llc10oZGlhZ3JhbXMvbHRlcl9wZW5ndWlucy5wbmcpIEFydHdvcmsgYnkgW0BhbGxpc29uX2hvcnN0XShodHRwczovL3R3aXR0ZXIuY29tL2FsbGlzb25faG9yc3QpCgojIyMgRXhwbG9yaW5nIGRhdGEgZnJhbWVzCgpUaGUgZmlyc3Qgc3RlcCB0byB1c2luZyBhbnkgZGF0YSBpcyB0byBsb29rIGF0IGl0ISEhIApSU3R1ZGlvIGNvbnRhaW5zIGEgc3BlY2lhbCBmdW5jdGlvbiBgVmlldygpYCB3aGljaCBhbGxvd3MgeW91IHRvIGxpdGVyYWxseSB2aWV3IGEgdmFyaWFibGUuCllvdSBjYW4gYWxzbyBjbGljayBvbiB0aGUgb2JqZWN0IGluIHRoZSBlbnZpcm9ubWVudCBwYW5lIHRvIHNlZSBpdHMgb3ZlcmFsbCBwcm9wZXJ0aWVzLCBvciBjbGljayB0aGUgdGFibGUgaWNvbiBvbiB0aGUgb2JqZWN0J3Mgcm93IHRvIGF1dG9tYXRpY2FsbHkgdmlldyB0aGUgdmFyaWFibGUuIAoKU29tZSB1c2VmdWwgZnVuY3Rpb25zIGZvciBleHBsb3Jpbmcgb3VyIGRhdGEgZnJhbWUgaW5jbHVkZToKCisgYGhlYWQoKWAgdG8gc2VlIHRoZSBmaXJzdCA2IHJvd3Mgb2YgYSBkYXRhIGZyYW1lLiBBZGRpdGlvbmFsIGFyZ3VtZW50cyBzdXBwbGllZCBjYW4gY2hhbmdlIHRoZSBudW1iZXIgb2Ygcm93cy4KKyBgdGFpbCgpYCB0byBzZWUgdGhlIGxhc3QgNiByb3dzIG9mIGEgZGF0YSBmcmFtZS4gQWRkaXRpb25hbCBhcmd1bWVudHMgc3VwcGxpZWQgY2FuIGNoYW5nZSB0aGUgbnVtYmVyIG9mIHJvd3MuCisgYG5hbWVzKClgIHRvIHNlZSB0aGUgY29sdW1uIG5hbWVzIG9mIHRoZSBkYXRhIGZyYW1lLgorIGBucm93KClgIHRvIHNlZSBob3cgbWFueSByb3dzIGFyZSBpbiB0aGUgZGF0YSBmcmFtZQorIGBuY29sKClgIHRvIHNlZSBob3cgbWFueSBjb2x1bW5zIGFyZSBpbiB0aGUgZGF0YSBmcmFtZS4KCldlIGNhbiBhZGRpdGlvbmFsbHkgZXhwbG9yZSBfb3ZlcmFsbCBwcm9wZXJ0aWVzXyBvZiB0aGUgZGF0YSBmcmFtZSB3aXRoIHR3byBkaWZmZXJlbnQgZnVuY3Rpb25zOiBgc3VtbWFyeSgpYCBhbmQgYHN0cigpYC4KClRoaXMgcHJvdmlkZXMgc3VtbWFyeSBzdGF0aXN0aWNzIGZvciBlYWNoIGNvbHVtbjoKCmBgYHtyIHBlbmd1aW5zLXN1bW1hcnl9CnN1bW1hcnkocGVuZ3VpbnMpCmBgYAoKVGhpcyBwcm92aWRlcyBhIHNob3J0IHZpZXcgb2YgdGhlICoqc3RyKip1Y3R1cmUgYW5kIGNvbnRlbnRzIG9mIHRoZSBkYXRhIGZyYW1lLgoKYGBge3IgcGVuZ3VpbnMtc3RyfQpzdHIocGVuZ3VpbnMpCmBgYAoKWW91J2xsIG5vdGljZSB0aGF0IHRoZSBjb2x1bW4gYHNwZWNpZXNgIGlzIGEgX2ZhY3Rvcl86IFRoaXMgaXMgYSBzcGVjaWFsIHR5cGUgb2YgY2hhcmFjdGVyIHZhcmlhYmxlIHRoYXQgcmVwcmVzZW50cyBkaXN0aW5jdCBjYXRlZ29yaWVzIGtub3duIGFzICJsZXZlbHMiLiAKV2UgaGF2ZSBsZWFybmVkIGhlcmUgdGhh
dCB0aGVyZSBhcmUgdGhyZWUgbGV2ZWxzIGluIHRoZSBgc3BlY2llc2AgY29sdW1uOiBBZGVsaWUsIENoaW5zdHJhcCwgYW5kIEdlbnRvby4KV2UgbWlnaHQgd2FudCB0byBleHBsb3JlIGluZGl2aWR1YWwgY29sdW1ucyBvZiB0aGUgZGF0YSBmcmFtZSBtb3JlIGluLWRlcHRoLiAKV2UgY2FuIGV4YW1pbmUgaW5kaXZpZHVhbCBjb2x1bW5zIHVzaW5nIHRoZSBkb2xsYXIgc2lnbiBgJGAgdG8gc2VsZWN0IG9uZSBieSBuYW1lOgoKYGBge3IgcGVuZ3VpbnMtc3Vic2V0fQojIEV4dHJhY3QgYmlsbF9sZW5ndGhfbW0gYXMgYSB2ZWN0b3IKcGVuZ3VpbnMkYmlsbF9sZW5ndGhfbW0KCiMgaW5kZXhpbmcgb3BlcmF0b3JzIGNhbiBiZSB1c2VkIG9uIHRoZXNlIHZlY3RvcnMgdG9vCnBlbmd1aW5zJGJpbGxfbGVuZ3RoX21tWzE6MTBdCmBgYAoKV2UgY2FuIHBlcmZvcm0gb3VyIHJlZ3VsYXIgdmVjdG9yIG9wZXJhdGlvbnMgb24gY29sdW1ucyBkaXJlY3RseS4KCmBgYHtyIHBlbmd1aW5zLWNvbC1tZWFuLCBsaXZlID0gVFJVRX0KIyBjYWxjdWxhdGUgdGhlIG1lYW4gb2YgdGhlIGJpbGxfbGVuZ3RoX21tIGNvbHVtbgptZWFuKHBlbmd1aW5zJGJpbGxfbGVuZ3RoX21tLAogICAgIG5hLnJtID0gVFJVRSkgIyByZW1vdmUgbWlzc2luZyB2YWx1ZXMgYmVmb3JlIGNhbGN1bGF0aW5nIHRoZSBtZWFuCmBgYAoKV2UgY2FuIGFsc28gY2FsY3VsYXRlIHRoZSBmdWxsIHN1bW1hcnkgc3RhdGlzdGljcyBmb3IgYSBzaW5nbGUgY29sdW1uIGRpcmVjdGx5LiAKCmBgYHtyIHBlbmd1aW5zLWNvbC1zdW1tYXJ5LCBsaXZlID0gVFJVRX0KIyBzaG93IGEgc3VtbWFyeSBvZiB0aGUgYmlsbF9sZW5ndGhfbW0gY29sdW1uCnN1bW1hcnkocGVuZ3VpbnMkYmlsbF9sZW5ndGhfbW0pCmBgYAoKRXh0cmFjdCBgc3BlY2llc2AgYXMgYSB2ZWN0b3IgYW5kIHN1YnNldCBpdCB0byBzZWUgYSBwcmV2aWV3LgoKYGBge3IgcGVuZ3VpbnMtY29sLXN1YnNldCwgbGl2ZSA9IFRSVUV9CiMgZ2V0IHRoZSBmaXJzdCAxMCB2YWx1ZXMgb2YgdGhlIHNwZWNpZXMgY29sdW1uCnBlbmd1aW5zJHNwZWNpZXNbMToxMF0KYGBgCgpBbmQgdmlldyBpdHMgX2xldmVsc18gd2l0aCB0aGUgYGxldmVscygpYCBmdW5jdGlvbi4KCmBgYHtyIHBlbmd1aW4tbGV2ZWxzfQpsZXZlbHMocGVuZ3VpbnMkc3BlY2llcykKYGBgCgojIyBGaWxlcyBhbmQgZGlyZWN0b3JpZXMKCkluIG1hbnkgc2l0dWF0aW9ucywgd2Ugd2lsbCBiZSByZWFkaW5nIGluIHRhYnVsYXIgZGF0YSBmcm9tIGEgZmlsZSBhbmQgdXNpbmcgaXQgYXMgYSBkYXRhIGZyYW1lLiAKVG8gcHJhY3RpY2UsIHdlIHdpbGwgcmVhZCBpbiBhIGZpbGUgd2Ugd2lsbCBiZSB1c2luZyBpbiB0aGUgbmV4dCBub3RlYm9vayBhcyB3ZWxsLCBgZ2VuZV9yZXN1bHRzX0dTRTQ0OTcxLnRzdmAsIGluIHRoZSBgZGF0YWAgZm9sZGVyLiAKRmlsZSBwYXRocyBhcmUgcmVsYXRpdmUgdG8gdGhlIGxvY2F0aW9uIHdoZXJlIHRoaXMgbm90ZWJvb2sgZmlsZSAoLlJtZCkg
aXMgc2F2ZWQuCgpIZXJlIHdlIHdpbGwgdXNlIGEgZnVuY3Rpb24sIGByZWFkX3RzdigpYCBmcm9tIHRoZSBgcmVhZHJgIHBhY2thZ2UuCkJlZm9yZSB3ZSBhcmUgYWJsZSB0byB1c2UgdGhlIGZ1bmN0aW9uLCB3ZSBoYXZlIHRvIGxvYWQgdGhlIHBhY2thZ2UgdXNpbmcgYGxpYnJhcnkoKWAuIAoKYGBge3IgcmVhZHJ9CmxpYnJhcnkocmVhZHIpCmBgYAoKYGZpbGUucGF0aCgpYCBjcmVhdGVzIGEgcHJvcGVybHkgZm9ybWF0dGVkIGZpbGUgcGF0aCBieSBhZGRpbmcgYSBwYXRoIHNlcGFyYXRvciAoYC9gIG9uIE1hYyBhbmQgTGludXggb3BlcmF0aW5nIHN5c3RlbXMsIHRoZSBsYXR0ZXIgb2Ygd2hpY2ggaXMgdGhlIG9wZXJhdGluZyBzeXN0ZW0gdGhhdCBvdXIgUlN0dWRpbyBTZXJ2ZXIgcnVucyBvbikgYmV0d2VlbiBzZXBhcmF0ZSBmb2xkZXJzIG9yIGRpcmVjdG9yaWVzLgpCZWNhdXNlIGZpbGUgcGF0aCBzZXBhcmF0b3JzIGNhbiBkaWZmZXIgYmV0d2VlbiB5b3VyIGNvbXB1dGVyIGFuZCB0aGUgY29tcHV0ZXIgb2Ygc29tZW9uZSB3aG8gd2FudHMgdG8gdXNlIHlvdXIgY29kZSwgd2UgdXNlIGBmaWxlLnBhdGgoKWAgaW5zdGVhZCBvZiB0eXBpbmcgb3V0IGAiZGF0YS9nZW5lX3Jlc3VsdHNfR1NFNDQ5NzEudHN2ImAuCkVhY2ggX2FyZ3VtZW50XyB0byBgZmlsZS5wYXRoKClgIGlzIGEgZGlyZWN0b3J5IG9yIGZpbGUgbmFtZS4KWW91J2xsIG5vdGljZSBlYWNoIGFyZ3VtZW50IGlzIGluIHF1b3Rlcywgd2Ugc3BlY2lmeSBgZGF0YWAgZmlyc3QgYmVjYXVzZSB0aGUgZmlsZSwgYGdlbmVfcmVzdWx0c19HU0U0NDk3MS50c3ZgIGlzIGluIHRoZSBgZGF0YWAgZm9sZGVyLiAKCmBgYHtyIGZpbGUucGF0aH0KZmlsZS5wYXRoKCJkYXRhIiwgImdlbmVfcmVzdWx0c19HU0U0NDk3MS50c3YiKQpgYGAKCkFzIHlvdSBjYW4gc2VlIGFib3ZlLCB0aGUgcmVzdWx0IG9mIHJ1bm5pbmcgYGZpbGUucGF0aCgpYCBpcyB0aGF0IGl0IF9jcmVhdGVzIGEgc3RyaW5nXyB3aXRoIGFuIGFjY3VyYXRlbHktZm9ybWF0dGVkIHBhdGggZm9yIHlvdXIgZmlsZSBzeXN0ZW0uClRoaXMgc3RyaW5nIGNhbiBiZSB1c2VkIG1vdmluZyBmb3J3YXJkIHdoZW4geW91IG5lZWQgdG8gcmVmZXIgdG8gdGhlIHBhdGggdG8geW91ciBmaWxlLgpMZXQncyBnbyBhaGVhZCBhbmQgc3RvcmUgdGhpcyBmaWxlIHBhdGggYXMgYSB2YXJpYWJsZSBpbiBvdXIgZW52aXJvbm1lbnQuIAoKYGBge3IgZmlsZS5wYXRoLXZhcmlhYmxlfQpnZW5lX2ZpbGVfcGF0aCA8LSBmaWxlLnBhdGgoImRhdGEiLCAiZ2VuZV9yZXN1bHRzX0dTRTQ0OTcxLnRzdiIpCmBgYAoKTm93IHdlIGFyZSByZWFkeSB0byB1c2UgYHJlYWRfdHN2KClgIHRvIHJlYWQgdGhlIGZpbGUgaW50byBSLgpUaGUgcmVzdWx0aW5nIGRhdGEgZnJhbWUgd2lsbCBiZSBzdG9yZWQgaW4gYSB2YXJpYWJsZSBuYW1lZCBgc3RhdHNfZGZgLgpOb3RlIHRoZSBgPC1gIChhc3NpZ25tZW50IG9wZXJhdG9yISkgaXMgcmVzcG9uc2libGUgZm9y
IHNhdmluZyB0aGlzIHRvIG91ciBnbG9iYWwgZW52aXJvbm1lbnQuIAoKYGBge3IgcmVhZC1zdGF0c30KIyByZWFkIGluIHRoZSBmaWxlIGBnZW5lX3Jlc3VsdHNfR1NFNDQ5NzEudHN2YCBmcm9tIHRoZSBkYXRhIGRpcmVjdG9yeQpzdGF0c19kZiA8LSByZWFkX3RzdihnZW5lX2ZpbGVfcGF0aCkKYGBgCgpUYWtlIGEgbG9vayBhdCB5b3VyIGVudmlyb25tZW50IHBhbmVsIHRvIHNlZSB3aGF0IGBzdGF0c19kZmAgbG9va3MgbGlrZS4gCldlIGNhbiBhbHNvIHByaW50IG91dCBhIHByZXZpZXcgb2YgdGhlIGBzdGF0c19kZmAgZGF0YSBmcmFtZSBoZXJlLiAKCmBgYHtyIHNob3ctc3RhdHMsIGxpdmUgPSBUUlVFfQojIGRpc3BsYXkgc3RhdHNfZGYKc3RhdHNfZGYKYGBgCgojIyMgU2Vzc2lvbiBJbmZvCgpBdCB0aGUgZW5kIG9mIGV2ZXJ5IG5vdGVib29rLCB5b3Ugd2lsbCBzZWUgdXMgcHJpbnQgb3V0IGBzZXNzaW9uSW5mb2AuIApUaGlzIGFpZHMgaW4gdGhlIHJlcHJvZHVjaWJpbGl0eSBvZiB5b3VyIGNvZGUgYnkgc2hvd2luZyBleGFjdGx5IHdoYXQgcGFja2FnZXMgCmFuZCB2ZXJzaW9ucyB3ZXJlIGJlaW5nIHVzZWQgdGhlIGxhc3QgdGltZSB0aGUgbm90ZWJvb2sgd2FzIHJ1bi4KCmBgYHtyfQpzZXNzaW9uSW5mbygpCmBgYAo=
+ + +
+
+ +
+ + + + + + + + + + + + + + + + + diff --git a/completed-notebooks/intro-to-R-tidyverse/02-intro_to_ggplot2.nb.html b/completed-notebooks/intro-to-R-tidyverse/02-intro_to_ggplot2.nb.html new file mode 100644 index 0000000..929e4c1 --- /dev/null +++ b/completed-notebooks/intro-to-R-tidyverse/02-intro_to_ggplot2.nb.html @@ -0,0 +1,3666 @@ + + + + + + + + + + + + + + + +Introduction to ggplot2 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + +
+

Objectives

+

This notebook will demonstrate how to:

+
    +
  • Load and use R packages
  • Read in and perform simple manipulations of data frames
  • Use ggplot2 to plot and visualize data
  • Customize plots using features of ggplot2
+
+

We’ll use a real gene expression dataset to get comfortable making +visualizations using ggplot2. We’ve performed differential expression +analyses on a pre-processed astrocytoma +microarray dataset. We’ll start by making a volcano plot of +differential gene expression results from this experiment. We performed +three sets of contrasts:

+
    +
  1. sex category contrasting: Male vs +Female
  2. tissue category contrasting: +Pilocytic astrocytoma tumor samples vs +normal cerebellum samples
  3. An interaction of both sex and +tissue.
+

More ggplot2 resources:

+ +
+
+

Set Up

+

We saved these results to a tab separated values (TSV) file called +gene_results_GSE44971.tsv. It’s been saved to the +data folder. File paths are relative to where this notebook +file (.Rmd) is saved. So we can reference it later, let’s make a +variable with our data directory name.

+ + + +
data_dir <- "data"
+ + + +

Let’s declare our output folder name as its own variable.

+ + + +
plots_dir <- "plots"
+ + + +

We can also create a directory if it doesn’t already exist.

+

There are a couple of ways that we can create that directory from within +R. One way is to use the base R function dir.create(), +which (as the name suggests) will create a directory. But this function +assumes that the directory does not yet exist, and it will issue a +warning (and return FALSE) if you try to create a directory that already exists. To avoid +this warning, we can place the directory creation inside an +if statement, so the code will only run if the directory +does not yet exist:

+ + + +
# The if statement here tests whether the plot directory exists and
+# only executes the expressions between the braces if it does not.
+if (!dir.exists(plots_dir)) {
+  dir.create(plots_dir)
+}
+ + + +
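The guarded pattern above can be tried out safely with a throwaway folder. This is a minimal sketch, assuming a hypothetical directory name (`plots_demo`) placed inside R's per-session temporary folder so the real `plots` folder is not touched; it runs the same `if` statement twice to show that the second pass is a no-op.

```r
# Hypothetical directory inside R's per-session temp folder, used here so the
# sketch does not touch the real "plots" folder.
demo_dir <- file.path(tempdir(), "plots_demo")

# Remove any copy left over from an earlier run of this sketch
unlink(demo_dir, recursive = TRUE)
existed_before <- dir.exists(demo_dir)

# First pass: the directory is missing, so dir.create() runs
if (!dir.exists(demo_dir)) {
  dir.create(demo_dir)
}
existed_after <- dir.exists(demo_dir)

# Second pass: the directory is now present, so the body is skipped
if (!dir.exists(demo_dir)) {
  dir.create(demo_dir)
}

# Report whether the directory existed before and after the first pass
c(before = existed_before, after = existed_after)
```

Because the check and the creation are kept together, the chunk can be re-run as often as needed without changing the outcome.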

In this notebook we will be using functions from the Tidyverse set of +packages, so we need to load in those functions using +library(). We could load the individual packages we need +one at a time, but it is convenient for now to load them all with the +tidyverse “package,” which groups many of them together as +a shortcut. Keep a lookout for where we tell you which individual +package different functions come from.

+ + + +
library(tidyverse)
+ + +
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
+✔ dplyr     1.1.4     ✔ readr     2.1.5
+✔ forcats   1.0.0     ✔ stringr   1.5.1
+✔ ggplot2   3.5.1     ✔ tibble    3.2.1
+✔ lubridate 1.9.3     ✔ tidyr     1.3.1
+✔ purrr     1.0.2     
+── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
+✖ dplyr::filter() masks stats::filter()
+✖ dplyr::lag()    masks stats::lag()
+ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
+ + + +
+
+

Read in the differential expression analysis results file

+

Here we are using a tidyverse function +read_tsv() from the readr package. Like we did +in the previous notebook, we will store the resulting data frame as +stats_df.

+ + + +
# read in the file `gene_results_GSE44971.tsv` from the data directory
+stats_df <- read_tsv(file.path(
+  data_dir,
+  "gene_results_GSE44971.tsv"
+))
+ + +
Rows: 6804 Columns: 8
+── Column specification ────────────────────────────────────────────────────────
+Delimiter: "\t"
+chr (3): ensembl_id, gene_symbol, contrast
+dbl (5): log_fold_change, avg_expression, t_statistic, p_value, adj_p_value
+
+ℹ Use `spec()` to retrieve the full column specification for this data.
+ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
+ + + +
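The column specification message printed above suggests `show_col_types = FALSE` as a way to quiet it. As a small sketch of what that looks like, the block below writes a hypothetical two-gene table (the gene names and values are made up for illustration) to a temporary file and reads it back with `read_tsv()`.

```r
library(readr)

# Hypothetical two-gene table written to a temporary file for illustration
toy_path <- tempfile(fileext = ".tsv")
writeLines(
  c(
    "gene_symbol\tp_value",
    "GENE_A\t0.01",
    "GENE_B\t0.5"
  ),
  toy_path
)

# show_col_types = FALSE suppresses the column specification message
toy_df <- read_tsv(toy_path, show_col_types = FALSE)
toy_df
```

The column types are still guessed in the same way; only the message is hidden.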

We can take a look at a column individually by using a +$. Note we are using head() so the whole thing +doesn’t print out.

+ + + +
head(stats_df$contrast)
+ + +
[1] "male_female" "male_female" "male_female" "male_female" "male_female"
+[6] "male_female"
+ + +
+ + + +

If we want to see a specific set of values, we can use brackets with +the indices of the values we’d like returned.

+ + + +
stats_df$avg_expression[6:10]
+ + +
[1] 19.084011  8.453933  5.116563  6.345609 25.473133
+ + + +
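The same `$` and bracket pattern works on any data frame. Here is a minimal sketch using a hypothetical four-row data frame (the gene names and expression values are invented for illustration):

```r
# Hypothetical toy data frame; the values are made up for illustration
toy_df <- data.frame(
  gene_symbol = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  avg_expression = c(5.2, 8.1, 19.0, 6.3)
)

# `$` pulls out one column as a vector
expression_values <- toy_df$avg_expression

# Brackets with a range of indices return just those positions
middle_values <- toy_df$avg_expression[2:3]
middle_values
```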

Let’s look at some basic statistics from the data set using +summary().

+ + + +
# summary of stats_df
+summary(stats_df)
+ + +
  ensembl_id        gene_symbol          contrast         log_fold_change    
+ Length:6804        Length:6804        Length:6804        Min.   :-180.8118  
+ Class :character   Class :character   Class :character   1st Qu.:  -1.6703  
+ Mode  :character   Mode  :character   Mode  :character   Median :   0.1500  
+                                                          Mean   :   0.2608  
+                                                          3rd Qu.:   2.1049  
+                                                          Max.   : 129.3009  
+ avg_expression     t_statistic           p_value         adj_p_value     
+ Min.   :  5.003   Min.   :-32.84581   Min.   :0.00000   Min.   :0.00000  
+ 1st Qu.:  6.304   1st Qu.: -1.16444   1st Qu.:0.01309   1st Qu.:0.05657  
+ Median :  8.482   Median :  0.10619   Median :0.18919   Median :0.41354  
+ Mean   : 13.847   Mean   : -0.00819   Mean   :0.31223   Mean   :0.44833  
+ 3rd Qu.: 14.022   3rd Qu.:  1.46589   3rd Qu.:0.57634   3rd Qu.:0.82067  
+ Max.   :190.708   Max.   : 10.48302   Max.   :0.99979   Max.   :0.99988  
+ + + +

The statistics for contrast are not very informative, so +let’s do that again with just the contrast column after +converting it to a factor.

+ + + +
# summary of `stats_df$contrast` as a factor
+summary(as.factor(stats_df$contrast))
+ + +
astrocytoma_normal        interaction        male_female 
+              2268               2268               2268 
+ + + +
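To see why the conversion matters, it can help to compare the two summaries side by side on a small vector. This sketch uses a hypothetical four-element character vector that borrows the contrast names from above:

```r
# Hypothetical short vector reusing the contrast labels from this dataset
toy_contrast <- c(
  "male_female", "interaction", "male_female", "astrocytoma_normal"
)

# As a character vector, summary() only reports length, class, and mode
summary(toy_contrast)

# As a factor, summary() counts how many times each level occurs
contrast_counts <- summary(as.factor(toy_contrast))
contrast_counts
```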
+
+

Set up the dataset

+

Before we make our plot, we want to calculate a set of new values for +each row; transformations of the raw statistics in our table. To do this +we will use a function from the dplyr package called +mutate() to make a new column of -log10 p values.

+ + + +
# add a `neg_log10_p` column to the data frame
+stats_df <- mutate(stats_df, # data frame we'd like to add a variable to
+  neg_log10_p = -log10(p_value) # column name and values
+)
+ + + +
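The effect of `mutate()` is easier to check by hand on a tiny table. In this sketch, the data frame and its p values are hypothetical and chosen so that the transformed values come out as round numbers:

```r
library(dplyr)

# Hypothetical p values chosen so the -log10 values are easy to verify
toy_df <- data.frame(
  gene_symbol = c("GENE_A", "GENE_B", "GENE_C"),
  p_value = c(0.1, 0.01, 0.001)
)

# mutate() keeps the existing columns and appends the new one
toy_df <- mutate(toy_df, neg_log10_p = -log10(p_value))
toy_df
```

Smaller p values give larger `neg_log10_p` values, which is why the most significant genes end up at the top of a volcano plot.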

Let’s filter to only male_female contrast data. First +let’s try out a logical expression:

+ + + +
stats_df$contrast == "male_female"
+ + + +

Now we can try out the filter() function. Notice that we +are not assigning the results to a variable, so this filtered dataset +will not be saved to the environment.

+ + + +
# filter stats_df to "male_female" only
+filter(stats_df, contrast == "male_female")
+ +
+ +
+ + +

Now we can assign the results to a new data frame: +male_female_df.

+ + + +
# filter and save to male_female_df
+male_female_df <- filter(stats_df, contrast == "male_female")
+ + + +
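The link between the logical expression and `filter()` can be sketched on a small example: `filter()` keeps the rows where the expression evaluates to `TRUE`. The data frame below is hypothetical and only reuses the contrast labels from this dataset.

```r
library(dplyr)

# Hypothetical toy data frame reusing the contrast labels from above
toy_df <- data.frame(
  gene_symbol = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  contrast = c(
    "male_female", "interaction", "male_female", "astrocytoma_normal"
  )
)

# The logical expression gives one TRUE/FALSE value per row
keep_row <- toy_df$contrast == "male_female"
keep_row

# filter() returns only the rows where the expression is TRUE
toy_male_female <- filter(toy_df, contrast == "male_female")
toy_male_female
```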
+
+

Plotting this data

+

Let’s make a volcano plot with this data. First let’s take a look at +only the tumor vs. normal comparison. Let’s save this as a separate data +frame by assigning it a new name.

+ + + +
tumor_normal_df <- filter(stats_df, contrast == "astrocytoma_normal")
+ + + +

To make this plot we will be using functions from the +ggplot2 package, the main plotting package of the +tidyverse. We use the first function, ggplot() to define +the data that will be plotted. Remember, the name of this package is +ggplot2, but the function we use is called +ggplot() without the 2. ggplot() +takes two main arguments:

+
    +
  1. data, which is the data frame that contains the data we +want to plot.
  2. mapping, which is a special list made with the +aes() function to describe which values will be used for +each aesthetic component of the plot, such as the x and +y coordinates of each point. (If you find calling things like the x and +y coordinates “aesthetics” confusing, don’t worry, you are not alone.) +Specifically, the aes() function is used to specify that a +given column (variable) in your data frame be mapped to a given +aesthetic component of the plot.
+ + + +
ggplot(
+  tumor_normal_df, # This first argument is the data frame with the data we want to plot
+  aes(
+    x = log_fold_change, # This is the column name of the values we want to use
+    # for the x coordinates
+    y = neg_log10_p # This is the column name of the data we want for the y-axis
+  )
+)
+ + +

+ + + +

You’ll notice this plot doesn’t have anything on it because we +haven’t specified a plot type yet. To do that, we will add another +ggplot layer with + which will specify exactly what we want +to plot. A volcano plot is a special kind of scatter plot, so to make +that we will want to plot individual points, which we can do with +geom_point().

+ + + +
# This first part is the same as before
+ggplot(
+  tumor_normal_df,
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p
+  )
+) +
+  # Now we are adding on a layer to specify what kind of plot we want
+  geom_point()
+ + +

+ + + +

Here’s a brief summary of ggplot2 structure. [Figure: ggplot2 structure]

+
+

Adjust our ggplot

+

Now that we have a base plot that shows our data, we can add layers +on to it and adjust it. We can adjust the color of points using the +color aesthetic.

+ + + +
ggplot(
+  tumor_normal_df,
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p,
+    color = avg_expression # We added this argument to color code the points!
+  )
+) +
+  geom_point()
+ + +

+ + + +

Because we have so many points overlapping one another, we will want +to adjust the transparency, which we can do with an alpha +argument.

+ + + +
ggplot(
+  tumor_normal_df,
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p,
+    color = avg_expression
+  )
+) +
+  geom_point(alpha = 0.2) # We are using the `alpha` argument to make our points transparent
+ + +

+ + + +

Notice that we added the alpha within the geom_point() +function, not to the aes(). We did this because we want all +of the points to have the same level of transparency, and it will not +vary depending on any variable in the data. We can also change the +background and appearance of the plot as a whole by adding a +theme.

+ + + +
ggplot(
+  tumor_normal_df,
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p,
+    color = avg_expression
+  )
+) +
+  geom_point(alpha = 0.2) +
+  # Add on this set of appearance presets to make it pretty
+  theme_bw() 
+ + +

+ + + +
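The difference between setting `alpha` in `geom_point()` and mapping it inside `aes()` can be sketched with two small plots. The data frame here is hypothetical (five made-up points that reuse the column names from this notebook); the first plot gives every point the same transparency, while the second lets transparency vary with a column, which also adds a legend.

```r
library(ggplot2)

# Hypothetical toy data reusing this notebook's column names
toy_df <- data.frame(
  log_fold_change = c(-2, -1, 0, 1, 2),
  neg_log10_p = c(4, 1, 0.2, 1.5, 6),
  avg_expression = c(5, 7, 9, 12, 20)
)

# Setting: alpha is a fixed value outside aes(), so all points share it
fixed_alpha_plot <- ggplot(
  toy_df,
  aes(x = log_fold_change, y = neg_log10_p)
) +
  geom_point(alpha = 0.2)

# Mapping: alpha is inside aes(), so it varies with avg_expression
mapped_alpha_plot <- ggplot(
  toy_df,
  aes(x = log_fold_change, y = neg_log10_p, alpha = avg_expression)
) +
  geom_point()

fixed_alpha_plot
mapped_alpha_plot
```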

We are not limited to a single plotting layer. For example, if we +want to add a horizontal line to indicate a significance cutoff, we can +do that with geom_hline(). For now, we will choose the +value of 5.5 (which is close to a Bonferroni-corrected threshold on the -log10 p value scale) and add that to +the plot.

+ + + +
ggplot(
+  tumor_normal_df,
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p,
+    color = avg_expression
+  )
+) +
+  geom_point(alpha = 0.2) +
+  geom_hline(yintercept = 5.5, color = "darkgreen") # we can specify colors by names here
+ + +

+ + + +
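The cutoff of 5.5 used for the horizontal line above can be compared with an explicit Bonferroni calculation. This is a rough sketch that assumes a family-wise significance level of 0.05 (not stated in the notebook) and uses the row counts reported earlier: 6804 rows in total and 2268 per contrast.

```r
alpha_level <- 0.05 # assumed family-wise significance level
n_tests_all <- 6804 # rows in stats_df, as reported by read_tsv()
n_tests_one <- 2268 # rows per contrast, from the factor summary

# A Bonferroni threshold divides alpha by the number of tests;
# -log10() puts it on the same scale as the y-axis of the volcano plot
bonferroni_all <- -log10(alpha_level / n_tests_all)
bonferroni_one <- -log10(alpha_level / n_tests_one)

round(c(all_rows = bonferroni_all, one_contrast = bonferroni_one), 2)
```

Under these assumptions the thresholds come out at roughly 5.1 and 4.7, so 5.5 is a slightly more conservative line than either.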

We can change the x and y labels using a few different strategies. +One approach is to use the functions xlab() and +ylab() individually to set, respectively, the x-axis label +and the y-axis label.

+ + + +
ggplot(
+  tumor_normal_df,
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p,
+    color = avg_expression
+  )
+) +
+  geom_point(alpha = 0.2) +
+  geom_hline(yintercept = 5.5, color = "darkgreen") +
+  theme_bw() +
+  # Add labels with separate functions:
+  xlab("log2 Fold Change Tumor/Normal") +
+  ylab("-log10 p value")
+ + +

+ + + +

Alternatively, we can use the ggplot2 function +labs(), which takes individual arguments for each label we +want to set. We can also include the argument title to +add an overall plot title.

+ + + +
ggplot(
+  tumor_normal_df,
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p,
+    color = avg_expression
+  )
+) +
+  geom_point(alpha = 0.2) +
+  geom_hline(yintercept = 5.5, color = "darkgreen") +
+  theme_bw() +
+  # Add x and y labels and overall plot title with arguments to labs():
+  labs(
+    x = "log2 Fold Change Tumor/Normal",
+    y = "-log10 p value",
+    title = "Astrocytoma Tumor vs Normal Cerebellum"
+  )
+ + +

+ + + +

Something great about the labs() function is you can +also use it to specify labels for your legends derived from +certain aesthetics. In this plot, our legend is derived from a color +aesthetic, so we can specify the keyword “color” to update the +legend title.

+ + + +
ggplot(
+  tumor_normal_df,
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p,
+    color = avg_expression
+  )
+) +
+  geom_point(alpha = 0.2) +
+  geom_hline(yintercept = 5.5, color = "darkgreen") +
+  theme_bw() +
+  # Add x and y labels and overall plot title (and more!) with arguments to labs():
+  labs(
+    x = "log2 Fold Change Tumor/Normal",
+    y = "-log10 p value",
+    title = "Astrocytoma Tumor vs Normal Cerebellum",
+    # Use the color keyword to label the color legend
+    color = "Average expression"
+  )
+ + +

+ + + +

Use this chunk to make the same kind of plot as the previous chunk, but instead plot the male-female contrast data, which is stored in male_female_df.

+ + + +
# Use this chunk to make the same kind of volcano plot, but with the male-female contrast data.
+ggplot(
+  male_female_df,
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p,
+    color = avg_expression
+  )
+) +
+  geom_point(alpha = 0.2) +
+  geom_hline(yintercept = 5.5, color = "darkgreen") +
+  theme_bw() +
+  labs(
+    x = "log2 Fold Change Male/Female",
+    y = "-log10 p value",
+    color = "Average expression"
+  )
+ + +

+ + + +

It turns out that we don’t have to plot each contrast separately. Instead, we can use the original data frame that contains all three contrasts’ data, stats_df, and add facet_wrap() to make each contrast its own plot.

+ + + +
ggplot(
+  stats_df, # Switch to the bigger data frame with all three contrasts' data
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p,
+    color = avg_expression
+  )
+) +
+  geom_point(alpha = 0.2) +
+  geom_hline(yintercept = 5.5, color = "darkgreen") +
+  theme_bw() +
+  facet_wrap(vars(contrast)) +
+  labs(
+    # Now that this includes the other contrasts,
+    # we'll make the x-axis label more general
+    x = "log2 Fold Change",
+    y = "-log10 p value",
+    color = "Average expression"
+  ) +
+  coord_cartesian(xlim = c(-25, 25)) # zoom in on the x-axis
+ + +

+ + + +
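The faceting and zooming steps above can be tried on a small scale. The sketch below uses a made-up data frame (toy_df and its columns are hypothetical, not part of the course data) and also contrasts coord_cartesian() with xlim(), which look similar but treat out-of-range points differently.

```r
library(ggplot2)

# Made-up data for illustration: three groups, two points each
toy_df <- data.frame(
  x = c(-30, -2, 1, 3, -1, 28),
  y = c(1, 4, 6, 2, 7, 3),
  group = rep(c("a", "b", "c"), each = 2)
)

# facet_wrap() draws one panel per value of `group`.
# coord_cartesian() only changes the visible window; all rows are kept.
zoomed_plot <- ggplot(toy_df, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(vars(group)) +
  coord_cartesian(xlim = c(-25, 25))

# xlim() sets the scale limits instead: values outside the range are
# treated as missing and dropped (with a warning) when the plot is drawn.
trimmed_plot <- ggplot(toy_df, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(vars(group)) +
  xlim(-25, 25)
```

For a scatter plot the two versions look alike, but the distinction matters once a layer computes something from the data (a smoothed line, for example), because xlim() removes points before that calculation happens.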

We can store the plot as an object in the global environment by using the <- operator. Here we will call this object volcano_plot.

+ + + +
# We are saving this plot to a variable named `volcano_plot`
+volcano_plot <- ggplot(
+  stats_df, 
+  aes(
+    x = log_fold_change,
+    y = neg_log10_p,
+    color = avg_expression
+  )
+) +
+  geom_point(alpha = 0.2) +
+  geom_hline(yintercept = 5.5, color = "darkgreen") +
+  theme_bw() +
+  facet_wrap(vars(contrast)) +
+  labs(
+    x = "log2 Fold Change",
+    y = "-log10 p value",
+    color = "Average expression"
+  ) +
+  coord_cartesian(xlim = c(-25, 25))
+ + + +

When we are happy with our plot, we can save the plot using +ggsave. It’s a good idea to also specify width +and height arguments (units in inches) to ensure the saved +plot is always the same size every time you run this code. Here, we’ll +save a 6”x6” plot.

+ + + +
ggsave(
+  plot = volcano_plot,
+  filename = file.path(plots_dir, "volcano_plot.png"),
+  width = 6,
+  height = 6
+)
+ + + +
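As a self-contained sketch of the same store-then-save pattern, the example below builds a tiny plot from made-up data and writes it to a temporary folder rather than to plots_dir. The units and dpi arguments are spelled out here only to make the defaults visible; toy_df, toy_plot, and out_file are hypothetical names, and writing a PNG assumes a PNG graphics device is available in your R installation.

```r
library(ggplot2)

# Made-up data for illustration
toy_df <- data.frame(x = c(1, 2, 3), y = c(2, 4, 8))

# Store the plot as an object, just like volcano_plot
toy_plot <- ggplot(toy_df, aes(x = x, y = y)) +
  geom_point()

# Save to a temporary location so nothing in the project is overwritten
out_file <- file.path(tempdir(), "toy_plot.png")

ggsave(
  filename = out_file,
  plot = toy_plot,
  width = 6,
  height = 6,
  units = "in", # inches is the default unit
  dpi = 300     # resolution used for raster formats such as PNG
)

# Confirm the file was written
file.exists(out_file)
```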
+
+

Session Info

+ + + +
# Print out the versions and packages we are using in this session
+sessionInfo()
+ + +
R version 4.4.0 (2024-04-24)
+Platform: x86_64-pc-linux-gnu
+Running under: Ubuntu 22.04.4 LTS
+
+Matrix products: default
+BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+
+locale:
+ [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+ [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+ [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+ [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+ [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+
+time zone: Etc/UTC
+tzcode source: system (glibc)
+
+attached base packages:
+[1] stats     graphics  grDevices utils     datasets  methods   base     
+
+other attached packages:
+ [1] lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1   dplyr_1.1.4    
+ [5] purrr_1.0.2     readr_2.1.5     tidyr_1.3.1     tibble_3.2.1   
+ [9] ggplot2_3.5.1   tidyverse_2.0.0 optparse_1.7.5 
+
+loaded via a namespace (and not attached):
+ [1] sass_0.4.9        utf8_1.2.4        generics_0.1.3    stringi_1.8.3    
+ [5] hms_1.1.3         digest_0.6.35     magrittr_2.0.3    evaluate_0.23    
+ [9] grid_4.4.0        timechange_0.3.0  fastmap_1.1.1     jsonlite_1.8.8   
+[13] fansi_1.0.6       scales_1.3.0      textshaping_0.3.7 getopt_1.20.4    
+[17] jquerylib_0.1.4   cli_3.6.2         crayon_1.5.2      rlang_1.1.3      
+[21] bit64_4.0.5       munsell_0.5.1     withr_3.0.0       cachem_1.0.8     
+[25] yaml_2.3.8        parallel_4.4.0    tools_4.4.0       tzdb_0.4.0       
+[29] colorspace_2.1-0  vctrs_0.6.5       R6_2.5.1          lifecycle_1.0.4  
+[33] bit_4.0.5         vroom_1.6.5       ragg_1.3.0        pkgconfig_2.0.3  
+[37] pillar_1.9.0      bslib_0.7.0       gtable_0.3.5      glue_1.7.0       
+[41] systemfonts_1.0.6 highr_0.10        xfun_0.43         tidyselect_1.2.1 
+[45] knitr_1.46        farver_2.1.1      htmltools_0.5.8.1 labeling_0.4.3   
+[49] rmarkdown_2.26    compiler_4.4.0   
+ + +
+
+ +
+ + +
+
+ +
+ + + + + + + + + + + + + + + + + diff --git a/completed-notebooks/intro-to-R-tidyverse/03-intro_to_tidyverse.nb.html b/completed-notebooks/intro-to-R-tidyverse/03-intro_to_tidyverse.nb.html new file mode 100644 index 0000000..b2459ee --- /dev/null +++ b/completed-notebooks/intro-to-R-tidyverse/03-intro_to_tidyverse.nb.html @@ -0,0 +1,3971 @@ + + + + + + + + + + + + + + + +Introduction to tidyverse + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + +
+

Objectives

+

This notebook will demonstrate how to:

+
    +
  • Use functions from the tidyverse to read and write data frames
  • +
  • Implement and use tidyverse functions to wrangle data (i.e. filter, +mutate, arrange, join)
  • +
  • Use R pipes (|>) to combine multiple operations
  • +
  • Use the apply() function to apply functions across rows +or columns of a matrix
  • +
+
+

We’ll use the same gene expression dataset we used in the previous notebook. It is a +pre-processed astrocytoma +microarray dataset that we performed a set of differential expression analyses +on.

+

More tidyverse resources:

+ +
+
+

Set Up

+

The tidyverse is a collection of packages that are handy for general data wrangling, analysis, and visualization. Other packages that are specifically handy for different biological analyses are found on Bioconductor. If we want to use a package’s functions, we first need to install the package.

+

Our RStudio Server already has the tidyverse group of packages installed for you. But if you needed to install it or other packages available on CRAN, you would do so using the install.packages() function like this: install.packages("tidyverse").

+ + + +
library(tidyverse)
+ + +
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
+✔ dplyr     1.1.4     ✔ readr     2.1.5
+✔ forcats   1.0.0     ✔ stringr   1.5.1
+✔ ggplot2   3.5.1     ✔ tibble    3.2.1
+✔ lubridate 1.9.3     ✔ tidyr     1.3.1
+✔ purrr     1.0.2     
+── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
+✖ dplyr::filter() masks stats::filter()
+✖ dplyr::lag()    masks stats::lag()
+ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
+ + + +
+

Referencing a library’s function with ::

+

Note that if we had not loaded the tidyverse set of packages using library() like above, and we wanted to use a tidyverse function like read_tsv(), we would need to tell R what package to find this function in. To do this, we would use :: to tell R to load in this function from the readr package by using readr::read_tsv(). You will see this :: method of referencing functions within packages throughout the course. We like to use it in part to remove any ambiguity in which version of a function we are using; it is not too uncommon for different packages to use the same name for very different functions!
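The conflict message printed when the tidyverse was loaded gives a handy example of two functions that share a name: dplyr::filter() and stats::filter(). The sketch below calls each one explicitly with :: on made-up data (toy_df and its columns are hypothetical) to show how different they are.

```r
# A small made-up data frame
toy_df <- data.frame(
  gene = c("A", "B", "C"),
  score = c(1, 5, 10)
)

# dplyr's filter() keeps the rows of a data frame that match a condition
high_scores <- dplyr::filter(toy_df, score > 2)
high_scores

# stats' filter() is unrelated: it applies a linear filter to a series.
# Here it computes a 3-point moving average.
smoothed <- stats::filter(c(1, 2, 3, 4, 5), rep(1 / 3, 3))
smoothed
```

Because each call names its package, the code behaves the same way whether or not library(tidyverse) has been run.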

+
+
+
+

Managing directories

+

Before we can import the data we need, we should double check where R +is looking for files, aka the current working +directory. We can do this by using the getwd() +function, which will tell us what folder we are in.

+ + + +
# Let's check what directory we are in:
+getwd()
+ + +
[1] "/__w/training-modules/training-modules/intro-to-R-tidyverse"
+ + +
+ + + +

For Rmd files, the working directory is wherever the file is located, +but commands executed in the console may have a different working +directory.

+

We will want to make a directory for our output and we will call this +directory: results. But before we create the directory, we +should check if it already exists. We will show two ways that we can do +this.

+

First, we can use the dir() function to have R list the +files in our working directory.

+ + + +
# Let's check what files are here
+dir()
+ + +
 [1] "00a-rstudio_guide.Rmd"                   
+ [2] "00b-debugging_resources.Rmd"             
+ [3] "00c-good_scientific_coding_practices.Rmd"
+ [4] "01-intro_to_base_R-live.Rmd"             
+ [5] "01-intro_to_base_R.nb.html"              
+ [6] "01-intro_to_base_R.Rmd"                  
+ [7] "02-intro_to_ggplot2-live.Rmd"            
+ [8] "02-intro_to_ggplot2.nb.html"             
+ [9] "02-intro_to_ggplot2.Rmd"                 
+[10] "03-intro_to_tidyverse-live.Rmd"          
+[11] "03-intro_to_tidyverse.nb.html"           
+[12] "03-intro_to_tidyverse.Rmd"               
+[13] "data"                                    
+[14] "diagrams"                                
+[15] "exercise_01-intro_to_base_R.Rmd"         
+[16] "exercise_02-intro_to_R.Rmd"              
+[17] "exercise_03a-intro_to_tidyverse.Rmd"     
+[18] "exercise_03b-intro_to_tidyverse.Rmd"     
+[19] "plots"                                   
+[20] "README.md"                               
+[21] "screenshots"                             
+[22] "scripts"                                 
+ + +
+ + + +

This shows us there is no folder called “results” yet.

+

If we want to more pointedly look for “results” in our working +directory we can use the dir.exists() function.

+ + + +
# Check if the results directory exists
+dir.exists("results")
+ + +
[1] FALSE
+ + + +

If the above says FALSE, that means we will need to create a results directory. We’ve previously seen that we can make directories in R using the base R function dir.create(). But by default this function emits a warning (and creates nothing) if you try to create a directory that already exists, which can be frustrating if you are re-running code! A different option is to use the fs package, which provides functions for you to interact with your computer’s file system with a more consistent behavior than the base R functions. One function from this package is fs::dir_create() (note that it has an underscore, not a period), and much like the base R dir.create(), it creates directories. It has some other helpful features too:
- It will simply do nothing if that directory already exists; no errors, and nothing will get overwritten
- It allows creating nested directories by default, i.e. in one call make directories inside of other directories

+

Let’s go ahead and use it to create our results +directory:

+ + + +
# Make a directory within the working directory called 'results'
+fs::dir_create("results")
+ + + +

After creating the results directory above, let’s re-run dir.exists() to see if it now exists.

+ + + +
# Re-check if the results directory exists
+dir.exists("results")
+ + +
[1] TRUE
+ + + +
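To see the two fs::dir_create() features in action without touching the project folder, here is a minimal sketch that works inside a temporary directory. The directory names (dir-demo, tables) are made up for illustration, and the base R call at the end shows one way to get comparable behavior with dir.create().

```r
# Work in a throwaway location so nothing in the project is touched
scratch_dir <- file.path(tempdir(), "dir-demo")
nested_dir <- file.path(scratch_dir, "results", "tables")

# fs: creates the intermediate directories in one call...
fs::dir_create(nested_dir)
# ...and quietly does nothing if the directory is already there
fs::dir_create(nested_dir)

# base R: needs recursive = TRUE for nested directories, and
# showWarnings = FALSE to stay quiet when the directory already exists
dir.create(nested_dir, recursive = TRUE, showWarnings = FALSE)

# Confirm the nested directory exists
dir.exists(nested_dir)
```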

The dir.exists() function will not work on files +themselves. In that case, there is an analogous function called +file.exists().

+

Try using the file.exists() function to see if the file +gene_results_GSE44971.tsv exists in the current directory. +Use the code chunk we set up for you below. Note that in our notebooks +(and sometimes elsewhere), wherever you see a +<FILL_IN_THE_BLANK> like in the chunk below, that is +meant for you to replace (including the angle brackets) with the correct +phrase before you run the chunk (otherwise you will get an error).

+ + + +
# Replace the <PUT_FILE_NAME_HERE> with the name of the file you are looking for
+# Remember to use quotes to make it a character string
+file.exists(<PUT_FILE_NAME_HERE>)
+ + + +

It doesn’t seem that the file exists in our current directory, but that doesn’t mean it doesn’t exist at all. In fact, this file is inside the relative path data/, so let’s check again if the whole relative path to that file exists.

+ + + +
# This time, use file.path() to form your argument to file.exists()
+file.exists(<PUT_PATH_TO_FILE_HERE>)
+ + + +

With the right relative path, we can confirm this file exists.
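As a small illustration of how these checks behave, the sketch below builds a throwaway directory and file under `tempfile()`. The names `demo_dir`, `demo_file`, `example.tsv`, and `missing.tsv` are hypothetical and unrelated to the workshop data, so this does not give away the answers to the chunks above.

```r
# Create a throwaway directory that holds one small file
demo_dir <- tempfile("exists_demo_")
dir.create(demo_dir)

# file.path() joins the pieces of a path with "/"
demo_file <- file.path(demo_dir, "example.tsv")
writeLines("gene\tvalue", demo_file)

# The file is found when we give the full path to it
file_found <- file.exists(demo_file)

# dir.exists() only reports TRUE for directories, not regular files
file_is_dir <- dir.exists(demo_file)

# A path that points nowhere gives FALSE rather than an error
missing_found <- file.exists(file.path(demo_dir, "missing.tsv"))

c(file_found, file_is_dir, missing_found)
```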

+
+

Read a TSV file

+

Declare the name of the directory where we will read in the data.

+ + + +
data_dir <- "data"
+ + + +

Although base R has functions to read in data files, the functions in the readr package (part of the tidyverse) are faster and more straightforward to use, so we are going to use those here. Because the file we are reading in is a TSV (tab separated values) file, we will be using the read_tsv() function. There are analogous functions for CSV (comma separated values) files (read_csv()) and other file types.

+
+
+
+

Read in the differential expression analysis results file

+ + + +
stats_df <- readr::read_tsv(
+  file.path(data_dir,
+            "gene_results_GSE44971.tsv")
+  )
+ + +
Rows: 6804 Columns: 8
+── Column specification ────────────────────────────────────────────────────────
+Delimiter: "\t"
+chr (3): ensembl_id, gene_symbol, contrast
+dbl (5): log_fold_change, avg_expression, t_statistic, p_value, adj_p_value
+
+ℹ Use `spec()` to retrieve the full column specification for this data.
+ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
+ + + +

Following the template of the previous chunk, use this chunk to read +in the file GSE44971.tsv that is in the data +folder and save it in the variable gene_df.

+ + + +
# Use this chunk to read in data from the file `GSE44971.tsv`
+gene_df <- readr::read_tsv(
+  file.path(data_dir,
+            "GSE44971.tsv")
+  )
+ + +
Rows: 20056 Columns: 59
+── Column specification ────────────────────────────────────────────────────────
+Delimiter: "\t"
+chr  (1): Gene
+dbl (58): GSM1094814, GSM1094815, GSM1094816, GSM1094817, GSM1094818, GSM109...
+
+ℹ Use `spec()` to retrieve the full column specification for this data.
+ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
+ + + +
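The messages printed above suggest two ways to quiet the column specification report. The sketch below tries both on a tiny made-up TSV supplied as literal text, assuming the readr package (version 2.0 or later) is installed; `toy_tsv`, `toy_df`, and `toy_quiet_df` are hypothetical names used only here.

```r
# A tiny TSV supplied as literal text; I() tells readr this is data, not a path
toy_tsv <- I("gene\tvalue\nA\t1.5\nB\t2.5\n")

# Option 1: declare the column types up front ("c" = character, "d" = double)
toy_df <- readr::read_tsv(toy_tsv, col_types = "cd")

# Option 2: let readr guess the types, but silence the message
toy_quiet_df <- readr::read_tsv(toy_tsv, show_col_types = FALSE)

toy_df
```

Declaring the types has the added benefit that the resulting columns do not depend on readr's guesses.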

Use this chunk to explore what gene_df looks like.

+ + + +
# Explore `gene_df`
+ + + +

What information is contained in gene_df?

+
+
+

R pipes

+

One nifty feature that was added to R in version 4.1 is +the pipe: |>. Pipes are very handy things that allow you +to funnel the result of one expression to the next, making your code +more streamlined and fluently expressing the flow of data through a +series of operations.

+

Note: If you are using a version of R prior to 4.1 (or looking at older code), pipe functionality was available through the magrittr package, which used a pipe that looked like this: %>%. That pipe was the inspiration for the native R pipe we are using here. While there are some minor differences, you can mostly treat them interchangeably as long as you load the magrittr package or dplyr, which also loads that version of the pipe.

+

For example, the output from this:

+ + + +
filter(stats_df, contrast == "male_female")
+ +
+ +
+ + +

…is the same as the output from this:

+ + + +
stats_df |> filter(contrast == "male_female")
+ +
+ +
+ + +

This can make a series of related commands cleaner and easier to follow. Let’s look at an example with our stats of how the same functions look with or without pipes:

+

Example 1: without pipes:

+ + + +
stats_arranged <- arrange(stats_df, t_statistic)
+stats_filtered <- filter(stats_arranged, avg_expression > 50)
+stats_nopipe <- select(stats_filtered, contrast, log_fold_change, p_value)
+ + + +

UGH, we have to keep track of all of those different intermediate +data frames and type their names so many times here! We could maybe +streamline things by using the same variable name at each stage, but +even then there is a lot of extra typing, and it is easy to get confused +about what has been done where. It’s annoying and makes it harder for +people to read.

+

Example 2: Same result as 1 but with pipes!

+ + + +
# Example of the same modifications as above but with pipes!
+stats_pipe  <- stats_df |>
+               arrange(t_statistic) |>
+               filter(avg_expression > 50) |>
+               select(contrast, log_fold_change, p_value)
+ + + +

What the |> (pipe) is doing here is feeding the result of the expression on its left into the first argument of the next function (to its right, or on the next line here). We can then skip that first argument (the data in these cases), and move right on to the part we care about at that step: what we are arranging, filtering, or selecting in this case. The key insight that makes the pipe work here is to recognize that each of these functions (arrange, filter, and select) is a fundamental dplyr (a tidyverse package) function that works as “data in, data out.” In other words, these functions operate on data frames, and return data frames; you give them a data frame, and they give you back a data frame. Because these functions all follow a “data in, data out” framework, we can chain them together with pipes and send data all the way through the…pipeline!
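The pipe is not specific to data frames or to dplyr. As a minimal sketch using only base R functions (R 4.1 or later is required for `|>`; the `values` vector is made up for illustration), the same "result goes into the first argument" rule applies to ordinary numeric functions:

```r
values <- c(4, 9, 16)

# Nested calls: we have to read from the inside out
nested_result <- round(mean(sqrt(values)), 1)

# Piped calls: read top to bottom; each result becomes the FIRST argument
# of the next call, so round(1) is really round(<previous result>, 1)
piped_result <- values |>
  sqrt() |>
  mean() |>
  round(1)

identical(nested_result, piped_result)
```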

+

Let’s double check that these versions with and without pipe yield +the same solution by using the base R function +all.equal().

+ + + +
all.equal(stats_nopipe, stats_pipe)
+ + +
[1] TRUE
+ + + +

all.equal() is letting us know that these two objects +are the same.
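One detail worth knowing about `all.equal()`: it compares numbers within a small tolerance, and when objects differ it returns a description of the difference rather than `FALSE`. The sketch below uses made-up vectors to show this, and wraps the comparison in `isTRUE()`, which is the usual way to get a plain `TRUE`/`FALSE` answer.

```r
x <- c(1, 2, 3)
y <- x + 1e-10  # differs from x by a tiny amount

# all.equal() treats tiny numeric differences as equal
nearly_equal <- all.equal(x, y)

# identical() demands an exact match
exactly_equal <- identical(x, y)

# When objects differ, all.equal() returns a character description
difference_report <- all.equal(x, c(1, 2, 4))

# isTRUE() turns either kind of result into a single TRUE or FALSE
safe_check <- isTRUE(difference_report)

difference_report
```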

+

Now that hopefully you are convinced that the tidyverse can help you make your code neater and easier to use and read, let’s go through some of the popular tidyverse functions so we can create pipelines like this.

+
+
+

Common tidyverse functions

+

Let’s say we wanted to filter this gene expression dataset to particular sample groups. In order to do this, we would use the function filter() as well as a logical statement (usually one that refers to a column or columns in the data frame).

+ + + +
# Here let's filter stats_df to only keep the gene_symbol "SNCA"
+stats_df |>
+  filter(gene_symbol == "SNCA")
+ +
+ +
+ + +

We can use filter() similarly for numeric +statements.

+ + + +
# Here let's filter the data to rows with average expression values above 50
+stats_df |>
+  filter(avg_expression > 50)
+ +
+ +
+ + +

We can apply multiple filters at once, which will require all of them +to be satisfied for every row in the results:

+ + + +
# filter to highly expressed genes with contrast "male_female"
+stats_df |>
+  filter(contrast == "male_female",
+         avg_expression > 50)
+ +
+ +
+ + +
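Separating conditions with commas inside `filter()` behaves like joining them with a logical AND (`&`). The sketch below checks this on a small made-up data frame (`toy_stats` and its values are hypothetical, and the dplyr package is assumed to be installed), and also shows `|` for cases where either condition is enough.

```r
library(dplyr)  # assumes dplyr is installed

# A made-up data frame with the same kinds of columns as stats_df
toy_stats <- data.frame(
  gene_symbol    = c("G1", "G2", "G3", "G4"),
  contrast       = c("male_female", "male_female", "other", "other"),
  avg_expression = c(80, 20, 90, 10)
)

# Comma-separated conditions: a row is kept only if ALL of them are TRUE
comma_version <- toy_stats |>
  filter(contrast == "male_female", avg_expression > 50)

# The same filter written with an explicit logical AND
and_version <- toy_stats |>
  filter(contrast == "male_female" & avg_expression > 50)

# Logical OR keeps rows that satisfy at least one condition
or_version <- toy_stats |>
  filter(contrast == "male_female" | avg_expression > 50)

comma_version
```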

When we are filtering, the %in% operator can come in +handy if we have multiple items we would like to match. Let’s take a +look at what using %in% does.

+ + + +
genes_of_interest <- c("SNCA", "CDKN1A")
+# Are these genes present in the `gene_symbol` column in stats_df?
+stats_df$gene_symbol %in% genes_of_interest
+ + + +

%in% returns a logical vector that we can now use in dplyr::filter().

+ + + +
# filter to keep only genes of interest
+stats_df |>
+  filter(gene_symbol %in% c("SNCA", "CDKN1A"))
+ +
+ +
+ + +
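To see what `%in%` produces on its own, here is a minimal base R sketch with a short made-up vector of gene symbols (`gene_symbols` is hypothetical; it stands in for the `gene_symbol` column). The result has one `TRUE` or `FALSE` per element on the left-hand side, which is what makes it usable for filtering.

```r
gene_symbols <- c("TP53", "SNCA", "BRCA1", "CDKN1A")
genes_of_interest <- c("SNCA", "CDKN1A")

# One logical value per element of gene_symbols
is_of_interest <- gene_symbols %in% genes_of_interest

# Keep the matches...
kept <- gene_symbols[is_of_interest]

# ...or negate with ! to keep everything else
dropped <- gene_symbols[!is_of_interest]

is_of_interest
```

The same negation works inside `filter()`, for example `filter(!(gene_symbol %in% genes_of_interest))`.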

Let’s return to our earlier filter() and build on it. This time, let’s keep only some of the columns from the data frame using the select() function. Let’s also save this as a new data frame called stats_filtered_df.

+ + + +
# filter to highly expressed "male_female"
+# and select gene_symbol, log_fold_change and t_statistic
+stats_filtered_df <- stats_df |>
+  filter(contrast == "male_female",
+         avg_expression > 50) |>
+  select(gene_symbol, log_fold_change, t_statistic)
+ + + +

Let’s say we wanted to arrange this dataset so that the genes are sorted from the smallest p-values to the largest. In order to do this, we would use the function arrange() as well as the column we would like to sort by (in this case p_value).

+ + + +
stats_df |>
+  arrange(p_value)
+ +
+ +
+ + +

What if we want to sort from largest to smallest? For example, what if we want to see the genes with the highest average expression? We can use the same function, but wrap the column name in the desc() function; this time we sort by the avg_expression column.

+ + + +
# arrange descending by avg_expression
+stats_df |>
+  arrange(desc(avg_expression))
+ +
+ +
+ + +

What if we would like to create a new column of values? For that we use the mutate() function.

+ + + +
stats_df |>
+  mutate(log10_p_value = -log10(p_value))
+ +
+ +
+ + +

What if we want to obtain summary statistics for a column or columns? +The summarize function allows us to calculate summary +statistics for a column. Here we will use summarize to calculate two +summary statistics of log-fold change across all genes: mean (function +mean()) and standard deviation (function +sd()).

+ + + +
stats_df |>
+  summarize(mean(log_fold_change),
+            sd(log_fold_change))
+ +
+ +
+ + +

What if we’d like to obtain summary statistics, but calculated separately for various groups? Conveniently named, there’s a function called group_by() that seamlessly allows us to do this. Also note that group_by() allows us to group by multiple variables at a time if we want to.

+ + + +
stats_summary_df <- stats_df |>
+      group_by(contrast) |>
+      summarize(mean(log_fold_change),
+                sd(log_fold_change))
+ + + +

Let’s look at a preview of what we made:

+ + + +
stats_summary_df
+ +
+ +
+ + +

Here we have the mean and standard deviation of the log fold change for each contrast we made.
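When the expressions inside `summarize()` are not given names, the resulting columns are labeled with the expressions themselves, which can be awkward to type later. A minimal sketch of naming them explicitly, using a small made-up data frame (`toy_stats`, `mean_lfc`, and `sd_lfc` are hypothetical names, and dplyr is assumed to be installed):

```r
library(dplyr)  # assumes dplyr is installed

# A made-up data frame with two contrasts
toy_stats <- data.frame(
  contrast        = c("a", "a", "b", "b"),
  log_fold_change = c(1, 3, -2, 2)
)

# name = expression gives each summary column a name of our choosing
toy_summary <- toy_stats |>
  group_by(contrast) |>
  summarize(mean_lfc = mean(log_fold_change),
            sd_lfc = sd(log_fold_change))

toy_summary
```

Named columns like these can then be used directly in later steps, such as `arrange(mean_lfc)`.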

+
+
+

A brief intro to the apply family of functions

+

In base R, the apply family of functions can be an alternative method for performing transformations across a data frame, matrix, or other object structures.

+

One member of this family is (shockingly) the function apply(), which operates on matrices.

+

A matrix is similar to a data frame in that it is a rectangular table +of data, but it has an additional constraint: rather than each column +having a type, ALL data in a matrix has the same type.

+

The first argument to apply() is the data object we want to work on. The second argument specifies whether we are applying the function across rows or across columns (1 for rows, 2 for columns). The third argument is the function we will apply to each row or column of the data object.

+

Remember that gene_df is a gene x sample gene expression +data frame that has columns of two different types, character and +numeric. Converting it to a matrix will require us to make them all the +same type. We can coerce it into a matrix using +as.matrix(), in which case R will pick a type that it can +convert everything to. What does it choose?

+ + + +
# Coerce `gene_df` into a matrix
+gene_matrix <- as.matrix(gene_df)
+ + + + + + +
# Explore the structure of the `gene_matrix` object
+str(gene_matrix)
+ + +
 chr [1:20056, 1:59] "ENSG00000000003" "ENSG00000000005" "ENSG00000000419" ...
+ - attr(*, "dimnames")=List of 2
+  ..$ : NULL
+  ..$ : chr [1:59] "Gene" "GSM1094814" "GSM1094815" "GSM1094816" ...
+ + + +

While that worked, it is rare that we want numbers converted to text, +so we are going to select only the columns with numeric values before +converting it to a matrix. We can do this most easily by removing the +first column, which contains the gene names stored as character +values.

+ + + +
# Let's save a new matrix object named `gene_num_matrix` containing only
+# the numeric values
+gene_num_matrix <- as.matrix(gene_df[, -1])
+
+# Explore the structure of the `gene_num_matrix` object
+str(gene_num_matrix)
+ + +
 num [1:20056, 1:58] 9.5951 -0.0436 8.5246 1.6013 0.6189 ...
+ - attr(*, "dimnames")=List of 2
+  ..$ : NULL
+  ..$ : chr [1:58] "GSM1094814" "GSM1094815" "GSM1094816" "GSM1094817" ...
+ + + +

Why do we have a [, -1] after gene_df in +the above chunk?
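One way to explore questions like this is to experiment on a toy data frame. The sketch below uses a small made-up data frame (`toy_df` and its contents are hypothetical) to show what `as.matrix()` does with mixed column types, and what a negative column index does.

```r
# A made-up data frame with one character column and two numeric columns
toy_df <- data.frame(
  Gene    = c("geneA", "geneB"),
  sample1 = c(1.5, 2.5),
  sample2 = c(3, 4)
)

# With mixed types, everything is converted to character
mixed_matrix <- as.matrix(toy_df)

# A negative index drops that column; here, the first (character) one
numeric_matrix <- as.matrix(toy_df[, -1])

typeof(mixed_matrix)
typeof(numeric_matrix)
dim(numeric_matrix)
```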

+

Now that the matrix is all numbers, we can do things like calculate +the column or row statistics using apply().

+ + + +
# Calculate row means
+gene_means <- apply(gene_num_matrix, 1, mean) # Notice we are using 1 here
+
+# How long will `gene_means` be?
+length(gene_means)
+ + +
[1] 20056
+ + + +

Note that we can obtain the same results if we select just the +columns with numeric values from the gene_df data frame. +This allows R to do the as.matrix() coercion automatically, and can be a +handy shortcut if you have a mostly numeric data frame.

+ + + +
# Calculate row means using the `gene_df` object after removing the character column
+# apply() converts this to a matrix internally
+gene_means_from_df <- apply(gene_df[, -1], 1, mean)
+
+# Let's check that the two gene means objects are equal
+all.equal(gene_means, gene_means_from_df)
+ + +
[1] TRUE
+ + + +

Now let’s investigate the same set up, but use 2 to +apply over the columns of our matrix.

+ + + +
# Calculate sample means
+sample_means <- apply(gene_num_matrix, 2, mean) # Notice we use 2 here
+
+# How long will `sample_means` be?
+length(sample_means)
+ + +
[1] 58
+ + + +
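With 20056 rows it is hard to check by eye what `apply()` did, so here is the same pattern on a tiny made-up matrix (`toy_matrix` and its row and column names are hypothetical). For the specific case of means, base R also provides `rowMeans()` and `colMeans()`, which give the same answers.

```r
# A 2 x 3 matrix; matrix() fills column by column, so
# geneA = 1, 3, 5 and geneB = 2, 4, 6
toy_matrix <- matrix(
  c(1, 2, 3, 4, 5, 6),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3"))
)

# 1 = apply the function to each row: one value per gene
gene_means_toy <- apply(toy_matrix, 1, mean)

# 2 = apply the function to each column: one value per sample
sample_means_toy <- apply(toy_matrix, 2, mean)

gene_means_toy
sample_means_toy
```

The advantage of `apply()` over the dedicated shortcuts is that the third argument can be any function, such as `median` or `sd`.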

We can put the gene names back into the numeric matrix object by +assigning them as rownames.

+ + + +
# Assign the gene names from gene_df$Gene to the `gene_num_matrix` object using
+# the `rownames()` function
+rownames(gene_num_matrix) <- gene_df$Gene
+
+# Explore the `gene_num_matrix` object
+head(gene_num_matrix)
+ + +
                 GSM1094814 GSM1094815 GSM1094816   GSM1094817  GSM1094818
+ENSG00000000003  9.59510150  8.4785070 12.6802129  8.677614838 10.75552946
+                 GSM1094819  GSM1094820 GSM1094821  GSM1094822 GSM1094823
+ENSG00000000003  6.37470691  9.10028584  7.3546860  8.51847190  9.4216113
+                GSM1094824  GSM1094825 GSM1094826  GSM1094827 GSM1094828
+ENSG00000000003  5.0239629  7.89737460  8.1126876  7.03444640  9.6984918
+                 GSM1094829 GSM1094830 GSM1094831 GSM1094832  GSM1094833
+ENSG00000000003 13.98689230 10.5868331  7.6836223  8.3862587 11.18932763
+                GSM1094834 GSM1094835  GSM1094836 GSM1094837 GSM1094838
+ENSG00000000003  9.7562003  9.6984918 10.56891510  9.9391025  7.8738131
+                GSM1094839  GSM1094840 GSM1094841 GSM1094842 GSM1094843
+ENSG00000000003  8.6311353  8.58077557  9.1579585  6.3317019 10.1939387
+                 GSM1094844  GSM1094845  GSM1094846 GSM1094847  GSM1094848
+ENSG00000000003 10.44364159  9.62435722 16.05075944  6.9334508  8.55180910
+                 GSM1094849 GSM1094850 GSM1094851  GSM1094852  GSM1094853
+ENSG00000000003  9.29497760  7.5027098  6.9593119  8.33588532  8.16826110
+                GSM1094854 GSM1094855 GSM1094856 GSM1094857  GSM1094858
+ENSG00000000003  9.8020077 7.92580451  8.5122426  8.5300217  6.45774124
+                 GSM1094859  GSM1094860 GSM1094861 GSM1094862 GSM1094863
+ENSG00000000003  8.06834906  6.29704946 9.59510150  8.2198571  6.0207988
+                 GSM1094864  GSM1094865  GSM1094866  GSM1094867  GSM1094868
+ENSG00000000003 10.60783176 10.23536609 9.031964212  7.66629540  1.06494502
+                GSM1094869 GSM1094870 GSM1094871
+ENSG00000000003  1.0408332  1.7262079  1.0292255
+ [ reached getOption("max.print") -- omitted 5 rows ]
+ + + +

Row names like this can be very convenient for keeping matrices +organized, but row names (and column names) can be lost or misordered if +you are not careful, especially during input and output, so treat them +with care.

+

Although the apply functions may not be as easy to use +as the tidyverse functions, for some applications, apply +methods can be better suited. In this workshop, we will not delve too +deeply into the various other apply functions (tapply(), +lapply(), etc.) but you can read more information about +them here.

+
+
+

The dplyr::join functions

+

Let’s say we have a scenario where we have two data frames that we +would like to combine. Recall that stats_df and +gene_df are data frames that contain information about some +of the same genes. The dplyr::join +family of functions are useful for various scenarios of combining +data frames. For a visual explanation, the tidyexplain +project has some helpful +animations of joins.

+

For now, we will focus on inner_join(), which will +combine data frames by only keeping information about matching rows that +are in both data frames. We need to use the by argument to +designate what column(s) should be used as a key to match the data +frames. In this case we want to match the gene information between the +two, so we will specify that we want to compare values in the +ensembl_id column from stats_df to the +Gene column from gene_df.

+ + + +
stats_df |>
+  # Join based on their shared column
+  # Called ensembl_id in stats_df and called Gene in gene_df
+  inner_join(gene_df, by = c('ensembl_id' = 'Gene'))
+ +
+ +
+ + +
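Because the real data frames are large, it can help to watch `inner_join()` work on a few made-up rows. In this sketch (all identifiers and values are hypothetical, and dplyr is assumed to be installed), only the IDs present in both data frames survive, and the key column keeps its name from the left-hand data frame.

```r
library(dplyr)  # assumes dplyr is installed

# Made-up stand-ins for stats_df and gene_df
toy_stats <- data.frame(
  ensembl_id = c("ENSG1", "ENSG2", "ENSG3"),
  p_value    = c(0.01, 0.2, 0.5)
)
toy_expr <- data.frame(
  Gene    = c("ENSG2", "ENSG3", "ENSG4"),
  sample1 = c(5.1, 7.3, 2.2)
)

# ENSG1 (only in toy_stats) and ENSG4 (only in toy_expr) are dropped
toy_joined <- toy_stats |>
  inner_join(toy_expr, by = c("ensembl_id" = "Gene"))

toy_joined
```

Other members of the join family, such as `left_join()`, differ mainly in which unmatched rows they keep.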
+
+

Save data to files

+
+

Save to TSV files

+

Let’s write some of the data frames we created to a file. To do this, we can use the readr family of write_*() functions. The first argument of write_tsv() is the data we want to write, and the second argument is a character string that describes the path to the new file we would like to create. Remember that we created a results directory to put our output in; because we want to save our data to a directory other than our working directory, we need to specify this. This is what we will use the file.path() function for. Let’s look in a bit more detail at what file.path() does, by examining the results of the function in the examples below.

+ + + +
# Which of these file paths is what we want to use to save our data to the
+# results directory we created at the beginning of this notebook?
+file.path("docker-install", "stats_summary.tsv")
+ + +
[1] "docker-install/stats_summary.tsv"
+ + +
file.path("results", "stats_summary.tsv")
+ + +
[1] "results/stats_summary.tsv"
+ + +
file.path("stats_summary.tsv", "results")
+ + +
[1] "stats_summary.tsv/results"
+ + + +

Replace <NEW_FILE_PATH> below with the +file.path() statement from above that will successfully +save our file to the results folder.

+ + + +
# Write our data frame to a TSV file
+readr::write_tsv(stats_summary_df, <NEW_FILE_PATH>)
+ + + +

Check in your results directory to see if your new file +has successfully saved.

+
+
+

Save to RDS files

+

For this example we have been working with data frames, which are conveniently represented as TSV or CSV tables. However, in other situations where we want to save more complicated or very large data structures, RDS (R Data Serialized/Single) files may be a better option for saving our data. RDS is R’s special file format for holding data exactly as you have it in your R environment. RDS files can also be compressed, meaning they will take up less space on your computer. Let’s save our data to an RDS file in our results folder. You will need to replace the .tsv with .RDS, but you can use what we determined as our file path for the last chunk as your template.

+ + + +
# Write your object to an RDS file
+readr::write_rds(stats_summary_df, <PUT_CORRECT_FILE_PATH_HERE>)
+ + + +
+
+

Read an RDS file

+

Now that you have learned the readr functions read_tsv(), write_tsv(), and write_rds(), what do you suppose the function you will need to read your RDS file is called? Use that function here to re-import your data in the chunk we set up for you below.

+ + + +
# Read in your RDS file
+reimport_df <- <PUT_FUNCTION_NAME>(file.path("results", "stats_summary.RDS"))
+ + + +
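To illustrate what "holding data exactly as you have it" means, the sketch below round-trips an object that would not fit neatly into a TSV table: a list holding a factor and a matrix. It uses the base R functions `saveRDS()` and `readRDS()` and a temporary file; `toy_results` and `rds_path` are made-up names for this example.

```r
# A temporary file path, so nothing is written into the project
rds_path <- tempfile(fileext = ".RDS")

# A list holding a factor (with a specific level order) and a matrix
toy_results <- list(
  groups = factor(c("tumor", "normal", "tumor"),
                  levels = c("normal", "tumor")),
  scores = matrix(1:4, nrow = 2)
)

# Save the object, then read it back in
saveRDS(toy_results, rds_path)
restored <- readRDS(rds_path)

# The restored object matches the original, types and all
identical(toy_results, restored)
```

A TSV file, by contrast, stores only text, so details like factor levels would have to be rebuilt after reading.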

As is good practice, we will end this session by printing out our +session info.

+
+
+

Session Info

+ + + +
# Print out the versions and packages we are using in this session
+sessionInfo()
+ + +
R version 4.4.0 (2024-04-24)
+Platform: x86_64-pc-linux-gnu
+Running under: Ubuntu 22.04.4 LTS
+
+Matrix products: default
+BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+
+locale:
+ [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+ [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+ [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+ [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+ [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+
+time zone: Etc/UTC
+tzcode source: system (glibc)
+
+attached base packages:
+[1] stats     graphics  grDevices utils     datasets  methods   base     
+
+other attached packages:
+ [1] lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1   dplyr_1.1.4    
+ [5] purrr_1.0.2     readr_2.1.5     tidyr_1.3.1     tibble_3.2.1   
+ [9] ggplot2_3.5.1   tidyverse_2.0.0 optparse_1.7.5 
+
+loaded via a namespace (and not attached):
+ [1] sass_0.4.9        utf8_1.2.4        generics_0.1.3    stringi_1.8.3    
+ [5] hms_1.1.3         digest_0.6.35     magrittr_2.0.3    evaluate_0.23    
+ [9] grid_4.4.0        timechange_0.3.0  fastmap_1.1.1     jsonlite_1.8.8   
+[13] fansi_1.0.6       scales_1.3.0      getopt_1.20.4     jquerylib_0.1.4  
+[17] cli_3.6.2         crayon_1.5.2      rlang_1.1.3       bit64_4.0.5      
+[21] munsell_0.5.1     withr_3.0.0       cachem_1.0.8      yaml_2.3.8       
+[25] parallel_4.4.0    tools_4.4.0       tzdb_0.4.0        colorspace_2.1-0 
+[29] vctrs_0.6.5       R6_2.5.1          lifecycle_1.0.4   bit_4.0.5        
+[33] fs_1.6.4          vroom_1.6.5       pkgconfig_2.0.3   pillar_1.9.0     
+[37] bslib_0.7.0       gtable_0.3.5      glue_1.7.0        xfun_0.43        
+[41] tidyselect_1.2.1  knitr_1.46        htmltools_0.5.8.1 rmarkdown_2.26   
+[45] compiler_4.4.0   
+ + +
+
+ +
aXBlbGluZSEKCkxldCdzIGRvdWJsZSBjaGVjayB0aGF0IHRoZXNlIHZlcnNpb25zIHdpdGggYW5kIHdpdGhvdXQgcGlwZSB5aWVsZCB0aGUgc2FtZSBzb2x1dGlvbiBieSB1c2luZyB0aGUgYmFzZSBSIGZ1bmN0aW9uIGBhbGwuZXF1YWwoKWAuCgpgYGB7ciBjaGVjay1waXBlfQphbGwuZXF1YWwoc3RhdHNfbm9waXBlLCBzdGF0c19waXBlKQpgYGAKCmBhbGwuZXF1YWwoKWAgaXMgbGV0dGluZyB1cyBrbm93IHRoYXQgdGhlc2UgdHdvIG9iamVjdHMgYXJlIHRoZSBzYW1lLgoKTm93IHRoYXQgaG9wZWZ1bGx5IHlvdSBhcmUgY29udmluY2VkIHRoYXQgdGhlIHRpZHl2ZXJzZSBjYW4gaGVscCB5b3UgbWFrZSB5b3VyIGNvZGUgbmVhdGVyIGFuZCBlYXNpZXIgdG8gdXNlIGFuZCByZWFkLCBsZXQncyBnbyB0aHJvdWdoIHNvbWUgb2YgdGhlIHBvcHVsYXIgdGlkeXZlcnNlIGZ1bmN0aW9ucyBhbmQgc28gd2UgY2FuIGNyZWF0ZSBwaXBlbGluZXMgbGlrZSB0aGlzLgoKCiMjIENvbW1vbiB0aWR5dmVyc2UgZnVuY3Rpb25zCgpMZXQncyBzYXkgd2Ugd2FudGVkIHRvIGZpbHRlciB0aGlzIGdlbmUgZXhwcmVzc2lvbiBkYXRhc2V0IHRvIHBhcnRpY3VsYXIgc2FtcGxlIGdyb3Vwcy4KSW4gb3JkZXIgdG8gZG8gdGhpcywgd2Ugd291bGQgdXNlIHRoZSBmdW5jdGlvbiBgZmlsdGVyKClgIGFzIHdlbGwgYXMgYSBsb2dpYyBzdGF0ZW1lbnQgKHVzdWFsbHkgb25lIHRoYXQgcmVmZXJzIHRvIGEgY29sdW1uIG9yIGNvbHVtbnMgaW4gdGhlIGRhdGEgZnJhbWUpLgoKYGBge3IgZmlsdGVyLWdlbmV9CiMgSGVyZSBsZXQncyBmaWx0ZXIgc3RhdHNfZGYgdG8gb25seSBrZWVwIHRoZSBnZW5lX3N5bWJvbCAiU05DQSIKc3RhdHNfZGYgfD4KICBmaWx0ZXIoZ2VuZV9zeW1ib2wgPT0gIlNOQ0EiKQpgYGAKCldlIGNhbiB1c2UgYGZpbHRlcigpYCBzaW1pbGFybHkgZm9yIG51bWVyaWMgc3RhdGVtZW50cy4KCmBgYHtyIGZpbHRlci1udW1lcmljLCBsaXZlID0gVFJVRX0KIyBIZXJlIGxldCdzIGZpbHRlciB0aGUgZGF0YSB0byByb3dzIHdpdGggYXZlcmFnZSBleHByZXNzaW9uIHZhbHVlcyBhYm92ZSA1MApzdGF0c19kZiB8PgogIGZpbHRlcihhdmdfZXhwcmVzc2lvbiA+IDUwKQpgYGAKCldlIGNhbiBhcHBseSBtdWx0aXBsZSBmaWx0ZXJzIGF0IG9uY2UsIHdoaWNoIHdpbGwgcmVxdWlyZSBhbGwgb2YgdGhlbSB0byBiZSBzYXRpc2ZpZWQgZm9yIGV2ZXJ5IHJvdyBpbiB0aGUgcmVzdWx0czoKCmBgYHtyIGZpbHRlci0yLCBsaXZlID0gVFJVRX0KIyBmaWx0ZXIgdG8gaGlnaGx5IGV4cHJlc3NlZCBnZW5lcyB3aXRoIGNvbnRyYXN0ICJtYWxlX2ZlbWFsZSIKc3RhdHNfZGYgfD4KICBmaWx0ZXIoY29udHJhc3QgPT0gIm1hbGVfZmVtYWxlIiwKICAgICAgICAgYXZnX2V4cHJlc3Npb24gPiA1MCkKYGBgCgpXaGVuIHdlIGFyZSBmaWx0ZXJpbmcsIHRoZSBgJWluJWAgb3BlcmF0b3IgY2FuIGNvbWUgaW4gaGFuZHkgaWYgd2UgaGF2ZSBtdWx0aXBsZSBpdGVtcyB3ZSB3b3Vs
ZCBsaWtlIHRvIG1hdGNoLgpMZXQncyB0YWtlIGEgbG9vayBhdCB3aGF0IHVzaW5nIGAlaW4lYCBkb2VzLgoKYGBge3IgaW4tZXhhbXBsZSwgZXZhbCA9IEZBTFNFfQpnZW5lc19vZl9pbnRlcmVzdCA8LSBjKCJTTkNBIiwgIkNES04xQSIpCiMgQXJlIHRoZXNlIGdlbmVzIHByZXNlbnQgaW4gdGhlIGBnZW5lX3N5bWJvbGAgY29sdW1uIGluIHN0YXRzX2RmPwpzdGF0c19kZiRnZW5lX3N5bWJvbCAlaW4lIGdlbmVzX29mX2ludGVyZXN0CmBgYAoKYCVpbiVgIHJldHVybnMgYSBsb2dpY2FsIHZlY3RvciB0aGF0IG5vdyB3ZSBjYW4gdXNlIGluIGBkcGx5cjo6ZmlsdGVyYC4KCmBgYHtyIGZpbHRlci1pbiwgbGl2ZSA9IFRSVUV9CiMgZmlsdGVyIHRvIGtlZXAgb25seSBnZW5lcyBvZiBpbnRlcmVzdApzdGF0c19kZiB8PgogIGZpbHRlcihnZW5lX3N5bWJvbCAlaW4lIGMoIlNOQ0EiLCAiQ0RLTjFBIikpCmBgYAoKTGV0J3MgcmV0dXJuIHRvIG91ciBmaXJzdCBgZmlsdGVyKClgIGFuZCBidWlsZCBvbiB0byBpdC4KVGhpcyB0aW1lLCBsZXQncyBrZWVwIG9ubHkgc29tZSBvZiB0aGUgY29sdW1ucyBmcm9tIHRoZSBkYXRhIGZyYW1lIHVzaW5nIHRoZQpgc2VsZWN0KClgIGZ1bmN0aW9uLgpMZXQncyBhbHNvIHNhdmUgdGhpcyBhcyBhIG5ldyBkYXRhIGZyYW1lIGNhbGxlZCBgc3RhdHNfZmlsdGVyZWRfZGZgLgoKYGBge3IgZmlsdGVyLXNlbGVjdCwgbGl2ZSA9IFRSVUV9CiMgZmlsdGVyIHRvIGhpZ2hseSBleHByZXNzZWQgIm1hbGVfZmVtYWxlIgojIGFuZCBzZWxlY3QgZ2VuZV9zeW1ib2wsIGxvZ19mb2xkX2NoYW5nZSBhbmQgdF9zdGF0aXN0aWMKc3RhdHNfZmlsdGVyZWRfZGYgPC0gc3RhdHNfZGYgfD4KICBmaWx0ZXIoY29udHJhc3QgPT0gIm1hbGVfZmVtYWxlIiwKICAgICAgICAgYXZnX2V4cHJlc3Npb24gPiA1MCkgfD4KICBzZWxlY3QobG9nX2ZvbGRfY2hhbmdlLCB0X3N0YXRpc3RpYykKYGBgCgpMZXQncyBzYXkgd2Ugd2FudGVkIHRvIGFycmFuZ2UgdGhpcyBkYXRhc2V0IHNvIHRoYXQgdGhlIGdlbmVzIGFyZSBhcnJhbmdlZCBieSB0aGUgc21hbGxlc3QgcCB2YWx1ZXMgdG8gdGhlIGxhcmdlc3QuCkluIG9yZGVyIHRvIGRvIHRoaXMsIHdlIHdvdWxkIHVzZSB0aGUgZnVuY3Rpb24gYGFycmFuZ2UoKWAgYXMgd2VsbCBhcyB0aGUgY29sdW1uIHdlIHdvdWxkIGxpa2UgdG8gc29ydCBieSAoaW4gdGhpcyBjYXNlIGBwX3ZhbHVlYCkuCgpgYGB7ciBhcnJhbmdlfQpzdGF0c19kZiB8PgogIGFycmFuZ2UocF92YWx1ZSkKYGBgCgpXaGF0IGlmIHdlIHdhbnQgdG8gc29ydCBmcm9tIGxhcmdlc3QgdG8gc21hbGxlc3Q/Ckxpa2UgaWYgd2Ugd2FudCB0byBzZWUgdGhlIGdlbmVzIHdpdGggdGhlIGhpZ2hlc3QgYXZlcmFnZSBleHByZXNzaW9uPwpXZSBjYW4gdXNlIHRoZSBzYW1lIGZ1bmN0aW9uLCBidXQgaW5zdGVhZCB1c2UgdGhlIGBkZXNjKClgIGZ1bmN0aW9uIGFuZCBub3cgd2UgYXJlIHVzaW5nIGBhdmdfZXhwcmVzc2lvbmAgY29sdW1uLgoK
YGBge3IgYXJyYW5nZS1kZXNjfQojIGFycmFuZ2UgZGVzY2VuZGluZyBieSBhdmdfZXhwcmVzc2lvbgpzdGF0c19kZiB8PgogIGFycmFuZ2UoZGVzYyhhdmdfZXhwcmVzc2lvbikpCmBgYAoKV2hhdCBpZiB3ZSB3b3VsZCBsaWtlIHRvIGNyZWF0ZSBhIG5ldyBjb2x1bW4gb2YgdmFsdWVzPwpGb3IgdGhhdCB3ZSB1c2UgYG11dGF0ZSgpYCBmdW5jdGlvbi4KCmBgYHtyIG11dGF0ZX0Kc3RhdHNfZGYgfD4KICBtdXRhdGUobG9nMTBfcF92YWx1ZSA9IC1sb2cxMChwX3ZhbHVlKSkKYGBgCgpXaGF0IGlmIHdlIHdhbnQgdG8gb2J0YWluIHN1bW1hcnkgc3RhdGlzdGljcyBmb3IgYSBjb2x1bW4gb3IgY29sdW1ucz8KVGhlIGBzdW1tYXJpemVgIGZ1bmN0aW9uIGFsbG93cyB1cyB0byBjYWxjdWxhdGUgc3VtbWFyeSBzdGF0aXN0aWNzIGZvciBhIGNvbHVtbi4KSGVyZSB3ZSB3aWxsIHVzZSBzdW1tYXJpemUgdG8gY2FsY3VsYXRlIHR3byBzdW1tYXJ5IHN0YXRpc3RpY3Mgb2YgbG9nLWZvbGQgY2hhbmdlIGFjcm9zcyBhbGwgZ2VuZXM6IG1lYW4gKGZ1bmN0aW9uIGBtZWFuKClgKSBhbmQgc3RhbmRhcmQgZGV2aWF0aW9uIChmdW5jdGlvbiBgc2QoKWApLgoKYGBge3Igc3VtbWFyaXplfQpzdGF0c19kZiB8PgogIHN1bW1hcml6ZShtZWFuKGxvZ19mb2xkX2NoYW5nZSksCiAgICAgICAgICAgIHNkKGxvZ19mb2xkX2NoYW5nZSkpCmBgYAoKV2hhdCBpZiB3ZSdkIGxpa2UgdG8gb2J0YWluIGEgc3VtbWFyeSBzdGF0aXN0aWNzIGJ1dCBoYXZlIHRoZW0gZm9yIHZhcmlvdXMgZ3JvdXBzPwpDb252ZW5pZW50bHkgbmFtZWQsIHRoZXJlJ3MgYSBmdW5jdGlvbiBjYWxsZWQgYGdyb3VwX2J5KClgIHRoYXQgc2VhbWxlc3NseSBhbGxvd3MgdXMgdG8gZG8gdGhpcy4KQWxzbyBub3RlIHRoYXQgYGdyb3VwX2J5KClgIGFsbG93cyB1cyB0byBncm91cCBieSBtdWx0aXBsZSB2YXJpYWJsZXMgYXQgYSB0aW1lIGlmIHlvdSB3YW50IHRvLgoKYGBge3Igc3VtbWFyaXplLWdyb3VwcywgbGl2ZSA9IFRSVUV9CnN0YXRzX3N1bW1hcnlfZGYgPC0gc3RhdHNfZGYgfD4KICAgICAgZ3JvdXBfYnkoY29udHJhc3QpIHw+CiAgICAgIHN1bW1hcml6ZShtZWFuKGxvZ19mb2xkX2NoYW5nZSksCiAgICAgICAgICAgICAgICBzZChsb2dfZm9sZF9jaGFuZ2UpKQpgYGAKCkxldCdzIGxvb2sgYXQgYSBwcmV2aWV3IG9mIHdoYXQgd2UgbWFkZToKCmBgYHtyfQpzdGF0c19zdW1tYXJ5X2RmCmBgYAoKSGVyZSB3ZSBoYXZlIHRoZSBtZWFuIGxvZyBmb2xkIGNoYW5nZSBleHByZXNzaW9uIHBlciBlYWNoIGNvbnRyYXN0IHdlIG1hZGUuCgojIyBBIGJyaWVmIGludHJvIHRvIHRoZSBgYXBwbHlgIGZhbWlseSBvZiBmdW5jdGlvbnMKCkluIGJhc2UgUiwgdGhlIGBhcHBseWAgZmFtaWx5IG9mIGZ1bmN0aW9ucyBjYW4gYmUgYW4gYWx0ZXJuYXRpdmUgbWV0aG9kcyBmb3IgcGVyZm9ybWluZyB0cmFuc2Zvcm1hdGlvbnMgYWNyb3NzIGEgZGF0YSBmcmFtZSwgbWF0cml4IG9yIG90aGVyIG9iamVjdCBz
dHJ1Y3R1cmVzLgoKT25lIG9mIHRoaXMgZmFtaWx5IGlzIChzaG9ja2luZ2x5KSB0aGUgZnVuY3Rpb24gYGFwcGx5KClgLCB3aGljaCBvcGVyYXRlcyBvbiBtYXRyaWNlcy4KCkEgbWF0cml4IGlzIHNpbWlsYXIgdG8gYSBkYXRhIGZyYW1lIGluIHRoYXQgaXQgaXMgYSByZWN0YW5ndWxhciB0YWJsZSBvZiBkYXRhLCBidXQgaXQgaGFzIGFuIGFkZGl0aW9uYWwgY29uc3RyYWludDogcmF0aGVyIHRoYW4gZWFjaCBjb2x1bW4gaGF2aW5nIGEgdHlwZSwgQUxMIGRhdGEgaW4gYSBtYXRyaXggaGFzIHRoZSBzYW1lIHR5cGUuCgpUaGUgZmlyc3QgYXJndW1lbnQgdG8gYGFwcGx5KClgIGlzIHRoZSBkYXRhIG9iamVjdCB3ZSB3YW50IHRvIHdvcmsgb24uClRoZSB0aGlyZCBhcmd1bWVudCBpcyB0aGUgZnVuY3Rpb24gd2Ugd2lsbCBhcHBseSB0byBlYWNoIHJvdyBvciBjb2x1bW4gb2YgdGhlIGRhdGEgb2JqZWN0LgpUaGUgc2Vjb25kIGFyZ3VtZW50IGluIHNwZWNpZmllcyB3aGV0aGVyIHdlIGFyZSBhcHBseWluZyB0aGUgZnVuY3Rpb24gYWNyb3NzIHJvd3Mgb3IgYWNyb3NzIGNvbHVtbnMgKDEgZm9yIHJvd3MsIDIgZm9yIGNvbHVtbnMpLgoKUmVtZW1iZXIgdGhhdCBgZ2VuZV9kZmAgaXMgYSBnZW5lIHggc2FtcGxlIGdlbmUgZXhwcmVzc2lvbiBkYXRhIGZyYW1lIHRoYXQgaGFzIGNvbHVtbnMgb2YgdHdvIGRpZmZlcmVudCB0eXBlcywgY2hhcmFjdGVyIGFuZCBudW1lcmljLgpDb252ZXJ0aW5nIGl0IHRvIGEgbWF0cml4IHdpbGwgcmVxdWlyZSB1cyB0byBtYWtlIHRoZW0gYWxsIHRoZSBzYW1lIHR5cGUuCldlIGNhbiBjb2VyY2UgaXQgaW50byBhIG1hdHJpeCB1c2luZyBgYXMubWF0cml4KClgLCBpbiB3aGljaCBjYXNlIFIgd2lsbCBwaWNrIGEgdHlwZSB0aGF0IGl0IGNhbiBjb252ZXJ0IGV2ZXJ5dGhpbmcgdG8uCldoYXQgZG9lcyBpdCBjaG9vc2U/CgpgYGB7ciBtYXRyaXh9CiMgQ29lcmNlIGBnZW5lX2RmYCBpbnRvIGEgbWF0cml4CmdlbmVfbWF0cml4IDwtIGFzLm1hdHJpeChnZW5lX2RmKQpgYGAKCmBgYHtyIG1hdHJpeC10eXBlLCBsaXZlID0gVFJVRX0KIyBFeHBsb3JlIHRoZSBzdHJ1Y3R1cmUgb2YgdGhlIGBnZW5lX21hdHJpeGAgb2JqZWN0CnN0cihnZW5lX21hdHJpeCkKYGBgCgpXaGlsZSB0aGF0IHdvcmtlZCwgaXQgaXMgcmFyZSB0aGF0IHdlIHdhbnQgbnVtYmVycyBjb252ZXJ0ZWQgdG8gdGV4dCwgc28gd2UgYXJlIGdvaW5nIHRvIHNlbGVjdCBvbmx5IHRoZSBjb2x1bW5zIHdpdGggbnVtZXJpYyB2YWx1ZXMgYmVmb3JlIGNvbnZlcnRpbmcgaXQgdG8gYSBtYXRyaXguCldlIGNhbiBkbyB0aGlzIG1vc3QgZWFzaWx5IGJ5IHJlbW92aW5nIHRoZSBmaXJzdCBjb2x1bW4sIHdoaWNoIGNvbnRhaW5zIHRoZSBnZW5lIG5hbWVzIHN0b3JlZCBhcyBjaGFyYWN0ZXIgdmFsdWVzLgoKYGBge3IgbWF0cml4LW51bWVyaWMsIGxpdmUgPSBUUlVFfQojIExldCdzIHNhdmUgYSBuZXcgbWF0cml4IG9iamVjdCBuYW1lcyBgZ2VuZV9udW1fbWF0
cml4YCBjb250YWluaW5nIG9ubHkKIyB0aGUgbnVtZXJpYyB2YWx1ZXMKZ2VuZV9udW1fbWF0cml4IDwtIGFzLm1hdHJpeChnZW5lX2RmWywgLTFdKQoKIyBFeHBsb3JlIHRoZSBzdHJ1Y3R1cmUgb2YgdGhlIGBnZW5lX251bV9tYXRyaXhgIG9iamVjdApzdHIoZ2VuZV9udW1fbWF0cml4KQpgYGAKCldoeSBkbyB3ZSBoYXZlIGEgYFssIC0xXWAgYWZ0ZXIgYGdlbmVfZGZgIGluIHRoZSBhYm92ZSBjaHVuaz8KCk5vdyB0aGF0IHRoZSBtYXRyaXggaXMgYWxsIG51bWJlcnMsIHdlIGNhbiBkbyB0aGluZ3MgbGlrZSBjYWxjdWxhdGUgdGhlIGNvbHVtbiBvciByb3cgc3RhdGlzdGljcyB1c2luZyBgYXBwbHkoKWAuCgpgYGB7ciByb3dtZWFuc30KIyBDYWxjdWxhdGUgcm93IG1lYW5zCmdlbmVfbWVhbnMgPC0gYXBwbHkoZ2VuZV9udW1fbWF0cml4LCAxLCBtZWFuKSAjIE5vdGljZSB3ZSBhcmUgdXNpbmcgMSBoZXJlCgojIEhvdyBsb25nIHdpbGwgYGdlbmVfbWVhbnNgIGJlPwpsZW5ndGgoZ2VuZV9tZWFucykKYGBgCgpOb3RlIHRoYXQgd2UgY2FuIG9idGFpbiB0aGUgc2FtZSByZXN1bHRzIGlmIHdlIHNlbGVjdCBqdXN0IHRoZSBjb2x1bW5zIHdpdGggbnVtZXJpYyB2YWx1ZXMgZnJvbSB0aGUgYGdlbmVfZGZgIGRhdGEgZnJhbWUuClRoaXMgYWxsb3dzIFIgdG8gZG8gdGhlIGFzLm1hdHJpeCgpIGNvZXJjaW9uIGF1dG9tYXRpY2FsbHksIGFuZCBjYW4gYmUgYSBoYW5keSBzaG9ydGN1dCBpZiB5b3UgaGF2ZSBhICptb3N0bHkqIG51bWVyaWMgZGF0YSBmcmFtZS4KCmBgYHtyIHJvd21lYW5zLWRhdGFmcmFtZX0KIyBDYWxjdWxhdGUgcm93IG1lYW5zIHVzaW5nIHRoZSBgZ2VuZV9kZmAgb2JqZWN0IGFmdGVyIHJlbW92aW5nIHRoZSBjaGFyYWN0ZXIgY29sdW1uCiMgYXBwbHkoKSBjb252ZXJ0cyB0aGlzIHRvIGEgbWF0cml4IGludGVybmFsbHkKZ2VuZV9tZWFuc19mcm9tX2RmIDwtIGFwcGx5KGdlbmVfZGZbLCAtMV0sIDEsIG1lYW4pCgojIExldCdzIGNoZWNrIHRoYXQgdGhlIHR3byBnZW5lIG1lYW5zIG9iamVjdHMgYXJlIGVxdWFsCmFsbC5lcXVhbChnZW5lX21lYW5zLCBnZW5lX21lYW5zX2Zyb21fZGYpCmBgYAoKTm93IGxldCdzIGludmVzdGlnYXRlIHRoZSBzYW1lIHNldCB1cCwgYnV0IHVzZSAyIHRvIGBhcHBseWAgb3ZlciB0aGUgY29sdW1ucyBvZiBvdXIgbWF0cml4LgoKYGBge3IgY29sbWVhbnN9CiMgQ2FsY3VsYXRlIHNhbXBsZSBtZWFucwpzYW1wbGVfbWVhbnMgPC0gYXBwbHkoZ2VuZV9udW1fbWF0cml4LCAyLCBtZWFuKSAjIE5vdGljZSB3ZSB1c2UgMiBoZXJlCgojIEhvdyBsb25nIHdpbGwgYHNhbXBsZV9tZWFuc2AgYmU/Cmxlbmd0aChzYW1wbGVfbWVhbnMpCmBgYAoKV2UgY2FuIHB1dCB0aGUgZ2VuZSBuYW1lcyBiYWNrIGludG8gdGhlIG51bWVyaWMgbWF0cml4IG9iamVjdCBieSBhc3NpZ25pbmcgdGhlbSBhcyByb3duYW1lcy4KCmBgYHtyIG1hdHJpeC1yb3duYW1lcywgbGl2ZSA9IFRSVUV9CiMgQXNzaWduIHRoZSBnZW5l
IG5hbWVzIGZyb20gZ2VuZV9kZiRHZW5lIHRvIHRoZSBgZ2VuZV9udW1fbWF0cml4YCBvYmplY3QgdXNpbmcKIyB0aGUgYHJvd25hbWVzKClgIGZ1bmN0aW9uCnJvd25hbWVzKGdlbmVfbnVtX21hdHJpeCkgPC0gZ2VuZV9kZiRHZW5lCgojIEV4cGxvcmUgdGhlIGBnZW5lX251bV9tYXRyaXhgIG9iamVjdApoZWFkKGdlbmVfbnVtX21hdHJpeCkKYGBgCgpSb3cgbmFtZXMgbGlrZSB0aGlzIGNhbiBiZSB2ZXJ5IGNvbnZlbmllbnQgZm9yIGtlZXBpbmcgbWF0cmljZXMgb3JnYW5pemVkLCBidXQgcm93IG5hbWVzIChhbmQgY29sdW1uIG5hbWVzKSBjYW4gYmUgbG9zdCBvciBtaXNvcmRlcmVkIGlmIHlvdSBhcmUgbm90IGNhcmVmdWwsIGVzcGVjaWFsbHkgZHVyaW5nIGlucHV0IGFuZCBvdXRwdXQsIHNvIHRyZWF0IHRoZW0gd2l0aCBjYXJlLgoKQWx0aG91Z2ggdGhlIGBhcHBseWAgZnVuY3Rpb25zIG1heSBub3QgYmUgYXMgZWFzeSB0byB1c2UgYXMgdGhlIHRpZHl2ZXJzZSBmdW5jdGlvbnMsIGZvciBzb21lIGFwcGxpY2F0aW9ucywgYGFwcGx5YCBtZXRob2RzIGNhbiBiZSBiZXR0ZXIgc3VpdGVkLgpJbiB0aGlzIHdvcmtzaG9wLCB3ZSB3aWxsIG5vdCBkZWx2ZSB0b28gZGVlcGx5IGludG8gdGhlIHZhcmlvdXMgb3RoZXIgYXBwbHkgZnVuY3Rpb25zIChgdGFwcGx5KClgLCBgbGFwcGx5KClgLCBldGMuKSBidXQgeW91IGNhbiByZWFkIG1vcmUgaW5mb3JtYXRpb24gYWJvdXQgdGhlbSBbaGVyZV0oaHR0cHM6Ly93d3cuZ3VydTk5LmNvbS9yLWFwcGx5LXNhcHBseS10YXBwbHkuaHRtbCkuCgojIyBUaGUgZHBseXI6OmpvaW4gZnVuY3Rpb25zCgpMZXQncyBzYXkgd2UgaGF2ZSBhIHNjZW5hcmlvIHdoZXJlIHdlIGhhdmUgdHdvIGRhdGEgZnJhbWVzIHRoYXQgd2Ugd291bGQgbGlrZSB0byBjb21iaW5lLgpSZWNhbGwgdGhhdCBgc3RhdHNfZGZgIGFuZCBgZ2VuZV9kZmAgYXJlIGRhdGEgZnJhbWVzIHRoYXQgY29udGFpbiBpbmZvcm1hdGlvbiBhYm91dCBzb21lIG9mIHRoZSBzYW1lIGdlbmVzLgpUaGUgW2BkcGx5cjo6am9pbmAgZmFtaWx5IG9mIGZ1bmN0aW9uc10oaHR0cHM6Ly9kcGx5ci50aWR5dmVyc2Uub3JnL3JlZmVyZW5jZS9tdXRhdGUtam9pbnMuaHRtbCkgYXJlIHVzZWZ1bCBmb3IgdmFyaW91cyBzY2VuYXJpb3Mgb2YgY29tYmluaW5nIGRhdGEgZnJhbWVzLgpGb3IgYSB2aXN1YWwgZXhwbGFuYXRpb24sIHRoZSBbYHRpZHlleHBsYWluYCBwcm9qZWN0XShodHRwczovL2dpdGh1Yi5jb20vZ2FkZW5idWllL3RpZHlleHBsYWluKSBoYXMgc29tZSBbaGVscGZ1bCBhbmltYXRpb25zIG9mIGpvaW5zXShodHRwczovL2dpdGh1Yi5jb20vZ2FkZW5idWllL3RpZHlleHBsYWluI211dGF0aW5nLWpvaW5zKS4KCkZvciBub3csIHdlIHdpbGwgZm9jdXMgb24gYGlubmVyX2pvaW4oKWAsIHdoaWNoIHdpbGwgY29tYmluZSBkYXRhIGZyYW1lcyBieSBvbmx5IGtlZXBpbmcgaW5mb3JtYXRpb24gYWJvdXQgbWF0Y2hpbmcgcm93cyB0aGF0IGFyZSBpbiBi
b3RoIGRhdGEgZnJhbWVzLgpXZSBuZWVkIHRvIHVzZSB0aGUgYGJ5YCBhcmd1bWVudCB0byBkZXNpZ25hdGUgd2hhdCBjb2x1bW4ocykgc2hvdWxkIGJlIHVzZWQgYXMgYSBrZXkgdG8gbWF0Y2ggdGhlIGRhdGEgZnJhbWVzLgpJbiB0aGlzIGNhc2Ugd2Ugd2FudCB0byBtYXRjaCB0aGUgZ2VuZSBpbmZvcm1hdGlvbiBiZXR3ZWVuIHRoZSB0d28sIHNvIHdlIHdpbGwgc3BlY2lmeSB0aGF0IHdlIHdhbnQgdG8gY29tcGFyZSB2YWx1ZXMgaW4gdGhlIGBlbnNlbWJsX2lkYCBjb2x1bW4gZnJvbSBgc3RhdHNfZGZgIHRvIHRoZSBgR2VuZWAgY29sdW1uIGZyb20gYGdlbmVfZGZgLgoKYGBge3IgaW5uZXItam9pbn0Kc3RhdHNfZGYgfD4KICAjIEpvaW4gYmFzZWQgb24gdGhlaXIgc2hhcmVkIGNvbHVtbgogICMgQ2FsbGVkIGVuc2VtYmxfaWQgaW4gc3RhdHNfZGYgYW5kIGNhbGxlZCBHZW5lIGluIGdlbmVfZGYKICBpbm5lcl9qb2luKGdlbmVfZGYsIGJ5ID0gYygnZW5zZW1ibF9pZCcgPSAnR2VuZScpKQpgYGAKCiMjIFNhdmUgZGF0YSB0byBmaWxlcwoKIyMjIyBTYXZlIHRvIFRTViBmaWxlcwoKTGV0J3Mgd3JpdGUgc29tZSBvZiB0aGUgZGF0YSBmcmFtZXMgd2UgY3JlYXRlZCB0byBhIGZpbGUuClRvIGRvIHRoaXMsIHdlIGNhbiB1c2UgdGhlIGByZWFkcmAgbGlicmFyeSBvZiBgd3JpdGVfKClgIGZ1bmN0aW9ucy4KVGhlIGZpcnN0IGFyZ3VtZW50IG9mIGB3cml0ZV90c3YoKWAgaXMgdGhlIGRhdGEgd2Ugd2FudCB0byB3cml0ZSwgYW5kIHRoZSBzZWNvbmQgYXJndW1lbnQgaXMgYSBjaGFyYWN0ZXIgc3RyaW5nIHRoYXQgZGVzY3JpYmVzIHRoZSBwYXRoIHRvIHRoZSBuZXcgZmlsZSB3ZSB3b3VsZCBsaWtlIHRvIGNyZWF0ZS4KUmVtZW1iZXIgdGhhdCB3ZSBjcmVhdGVkIGEgYHJlc3VsdHNgIGRpcmVjdG9yeSB0byBwdXQgb3VyIG91dHB1dCBpbiwgYnV0IGlmIHdlIHdhbnQgdG8gc2F2ZSBvdXIgZGF0YSB0byBhIGRpcmVjdG9yeSBvdGhlciB0aGFuIG91ciB3b3JraW5nIGRpcmVjdG9yeSwgd2UgbmVlZCB0byBzcGVjaWZ5IHRoaXMuClRoaXMgaXMgd2hhdCB3ZSB3aWxsIHVzZSB0aGUgYGZpbGUucGF0aCgpYCBmdW5jdGlvbiBmb3IuCkxldCdzIGxvb2sgaW4gYSBiaXQgbW9yZSBkZXRhaWwgd2hhdCBgZmlsZS5wYXRoKClgIGRvZXMsIGJ5IGV4YW1pbmluZyB0aGUgcmVzdWx0cyBvZiB0aGUgZnVuY3Rpb24gaW4gdGhlIGV4YW1wbGVzIGJlbG93LgoKYGBge3IgZmlsZS1wYXRoLXF1aXp9CiMgV2hpY2ggb2YgdGhlc2UgZmlsZSBwYXRocyBpcyB3aGF0IHdlIHdhbnQgdG8gdXNlIHRvIHNhdmUgb3VyIGRhdGEgdG8gdGhlCiMgcmVzdWx0cyBkaXJlY3Rvcnkgd2UgY3JlYXRlZCBhdCB0aGUgYmVnaW5uaW5nIG9mIHRoaXMgbm90ZWJvb2s/CmZpbGUucGF0aCgiZG9ja2VyLWluc3RhbGwiLCAic3RhdHNfc3VtbWFyeS50c3YiKQpmaWxlLnBhdGgoInJlc3VsdHMiLCAic3RhdHNfc3VtbWFyeS50c3YiKQpmaWxlLnBhdGgoInN0YXRzX3N1bW1hcnkudHN2
IiwgInJlc3VsdHMiKQpgYGAKClJlcGxhY2UgYDxORVdfRklMRV9QQVRIPmAgYmVsb3cgd2l0aCB0aGUgYGZpbGUucGF0aCgpYCBzdGF0ZW1lbnQgZnJvbSBhYm92ZSB0aGF0IHdpbGwgc3VjY2Vzc2Z1bGx5IHNhdmUgb3VyIGZpbGUgdG8gdGhlIGByZXN1bHRzYCBmb2xkZXIuCgpgYGB7ciBldmFsPUZBTFNFfQojIFdyaXRlIG91ciBkYXRhIGZyYW1lIHRvIGEgVFNWIGZpbGUKcmVhZHI6OndyaXRlX3RzdihzdGF0c19zdW1tYXJ5X2RmLCA8TkVXX0ZJTEVfUEFUSD4pCmBgYAoKQ2hlY2sgaW4geW91ciBgcmVzdWx0c2AgZGlyZWN0b3J5IHRvIHNlZSBpZiB5b3VyIG5ldyBmaWxlIGhhcyBzdWNjZXNzZnVsbHkgc2F2ZWQuCgojIyMjIFNhdmUgdG8gUkRTIGZpbGVzCgpGb3IgdGhpcyBleGFtcGxlIHdlIGhhdmUgYmVlbiB3b3JraW5nIHdpdGggZGF0YSBmcmFtZXMsIHdoaWNoIGFyZSBjb252ZW5pZW50bHkgcmVwcmVzZW50ZWQgYXMgVFNWIG9yIENTViB0YWJsZXMuCkhvd2V2ZXIsIGluIG90aGVyIHNpdHVhdGlvbnMgd2UgbWF5IHdhbnQgdG8gc2F2ZSBtb3JlIGNvbXBsaWNhdGVkIG9yIHZlcnkgbGFyZ2UgZGF0YSBzdHJ1Y3R1cmVzLCBSRFMgKFIgRGF0YSBTZXJpYWxpemVkL1NpbmdsZSkgZmlsZXMgbWF5IGJlIGEgYmV0dGVyIG9wdGlvbiBmb3Igc2F2aW5nIG91ciBkYXRhLgpSRFMgaXMgUidzIHNwZWNpYWwgZmlsZSBmb3JtYXQgZm9yIGhvbGRpbmcgZGF0YSBleGFjdGx5IGFzIHlvdSBoYXZlIGl0IGluIHlvdXIgUiBlbnZpcm9ubWVudC4KUkRTIGZpbGVzIGNhbiBhbHNvIGJlIGNvbXByZXNzZWQsIG1lYW5pbmcgdGhleSB3aWxsIHRha2UgdXAgbGVzcyBzcGFjZSBvbiB5b3VyIGNvbXB1dGVyLgpMZXQncyBzYXZlIG91ciBkYXRhIHRvIGFuIFJEUyBmaWxlIGluIG91ciBgcmVzdWx0c2AgZm9sZGVyLgpZb3Ugd2lsbCBuZWVkIHRvIHJlcGxhY2UgdGhlIGAudHN2YCB3aXRoIGAuUkRTYCwgYnV0IHlvdSBjYW4gdXNlIHdoYXQgd2UgZGV0ZXJtaW5lZCBhcyBvdXIgZmlsZSBwYXRoIGZvciB0aGUgbGFzdCBjaHVuayBhcyB5b3VyIHRlbXBsYXRlLgoKYGBge3IgZXZhbD1GQUxTRX0KIyBXcml0ZSB5b3VyIG9iamVjdCB0byBhbiBSRFMgZmlsZQpyZWFkcjo6d3JpdGVfcmRzKHN0YXRzX3N1bW1hcnlfZGYsIDxQVVRfQ09SUkVDVF9GSUxFX1BBVEhfSEVSRT4pCmBgYAoKIyMjIyBSZWFkIGFuIFJEUyBmaWxlCgpTaW5jZSBub3cgeW91IGhhdmUgbGVhcm5lZCB0aGUgYHJlYWRyYCBmdW5jdGlvbnM6IGByZWFkX3RzdigpYCwgYHdyaXRlX3RzdigpYCwgYW5kIG5vdywgYHdyaXRlX3JkcygpYCwgd2hhdCBkbyB5b3Ugc3VwcG9zZSB0aGUgZnVuY3Rpb24geW91IHdpbGwgbmVlZCB0byByZWFkIHlvdXIgUkRTIGZpbGUgaXMgY2FsbGVkPwpVc2UgdGhhdCBmdW5jdGlvbiBoZXJlIHRvIHJlLWltcG9ydCB5b3VyIGRhdGEgaW4gdGhlIGNodW5rIHdlIHNldCB1cCBmb3IgeW91IGJlbG93LgoKYGBge3IgZXZhbD1GQUxTRX0KIyBSZWFkIGluIHlvdXIgUkRT
IGZpbGUKcmVpbXBvcnRfZGYgPC0gPFBVVF9GVU5DVElPTl9OQU1FPihmaWxlLnBhdGgoInJlc3VsdHMiLCAic3RhdHNfc3VtbWFyeS5SRFMiKSkKYGBgCgpBcyBpcyBnb29kIHByYWN0aWNlLCB3ZSB3aWxsIGVuZCB0aGlzIHNlc3Npb24gYnkgcHJpbnRpbmcgb3V0IG91ciBzZXNzaW9uIGluZm8uCgojIyMgU2Vzc2lvbiBJbmZvCgpgYGB7cn0KIyBQcmludCBvdXQgdGhlIHZlcnNpb25zIGFuZCBwYWNrYWdlcyB3ZSBhcmUgdXNpbmcgaW4gdGhpcyBzZXNzaW9uCnNlc3Npb25JbmZvKCkKYGBgCg==
+ + +
+
+ +
+ + + + + + + + + + + + + + + + + diff --git a/completed-notebooks/scRNA-seq/00-scRNA_introduction.html b/completed-notebooks/scRNA-seq/00-scRNA_introduction.html new file mode 100644 index 0000000..a975651 --- /dev/null +++ b/completed-notebooks/scRNA-seq/00-scRNA_introduction.html @@ -0,0 +1,1677 @@ + + + + + + + + + + + + + + + +Introduction to single-cell RNA-seq + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + +
+

Single-cell RNA-seq Technologies

+

Single-cell RNA-seq (scRNA-seq) technologies can be divided into two +categories, tag-based and full-length, based on their capture methods +and quantitative nature.

+

In tag-based scRNA-seq, cells are separated by +emulsion/droplets, and individual cells are given a unique cell barcode +prior to sequencing. An example of tag-based scRNA-seq is 10x Genomics +(Zheng et +al. 2017).

+

In full-length scRNA-seq, cells are physically separated +into individual wells of a plate and are often also sorted by other +means (e.g., Fluorescence Activated Cell Sorting). With full-length +scRNA-seq, each cell is sequenced individually and has its own fastq +file. An example of full-length scRNA-seq is Smart-seq2 (Picelli et +al. 2014).

+

For the purposes of this tutorial, we will focus on tag-based +scRNA-seq, but it is important to keep in mind that the pre-processing +steps and the biases to look out for in post-processing vary based on +technology and how the cells are sorted.

+

For more extensive background on single-cell experimental methods, +Predeus et al. have a very good tutorial for scRNA-seq. We +will also refer extensively to the book Orchestrating +Single-Cell Analysis with Bioconductor (Amezquita et +al.).

+
+Overall view of scRNA-seq tag-based workflow +
Overall view of scRNA-seq tag-based +workflow
+
+
+

Tag-based scRNA-seq

+

Example: 10x Genomics (Zheng et +al. 2017). Individual cells are separated by emulsion/droplets +prior to cell lysis. Transcripts from each cell are then tagged with two +barcodes: a cell-specific barcode and a Unique Molecular Identifier +(UMI) (Islam et +al. 2014). All transcripts from all cells are then pooled +together and undergo PCR amplification and sequencing as if they are one +sample.

+

Tagging of each transcript with a different UMI before amplification +allows the identification of PCR duplicates, allowing control for PCR +amplification errors and biases. Individual samples have two fastq +files: one for the cell and UMI barcodes (R1) and another with the +transcript sequence reads (R2).
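To make the idea of UMI-based duplicate removal concrete, here is a minimal shell sketch. It assumes a made-up three-column table (cell barcode, UMI, gene) and simply collapses identical rows. Quantification tools such as Alevin use more involved procedures (for example, to cope with sequencing errors in the UMIs), so treat this as an illustration of the principle only.

```shell
# Toy table of reads: cell barcode, UMI, gene (all values are made up).
tmp_dir=$(mktemp -d)
toy_reads="$tmp_dir/toy_reads.tsv"
printf '%s\t%s\t%s\n' \
  CELL_A UMI_1 GeneX \
  CELL_A UMI_1 GeneX \
  CELL_A UMI_1 GeneX \
  CELL_A UMI_2 GeneX \
  CELL_B UMI_1 GeneX > "$toy_reads"

# Reads that share a cell barcode, UMI, and gene are treated as PCR copies
# of one molecule, so collapse identical rows before counting molecules
# per cell and gene.
umi_counts=$(sort -u "$toy_reads" | cut -f 1,3 | sort | uniq -c |
  awk '{ print $2 "\t" $3 "\t" $1 }')
printf '%s\n' "$umi_counts"
```

In this toy table, three of the four reads for CELL_A and GeneX carry the same UMI, so they are counted as a single molecule rather than three.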

+
+

Pros

+
    +
  • Can run potentially millions of cells at once.
  • +
  • Much less computationally demanding.
  • +
  • Won’t take up all your computer’s storage.
  • +
  • Much cheaper.
  • +
+
+
+

Cons

+
    +
  • Sequencing is not bidirectional so data will likely have more +intense 3’ bias.
  • +
  • The sequencing depth per cell with these technologies is generally +lower.
  • +
+
+
+
+
+

Resources

+ +
+

Literature on the comparisons and explanations of scRNA-seq +technologies

+ +
+ +
+ + + +
+
+ +
+ + + + + + + + + + + + + + + + diff --git a/completed-notebooks/scRNA-seq/01-scRNA_quant_qc.nb.html b/completed-notebooks/scRNA-seq/01-scRNA_quant_qc.nb.html new file mode 100644 index 0000000..512f3d9 --- /dev/null +++ b/completed-notebooks/scRNA-seq/01-scRNA_quant_qc.nb.html @@ -0,0 +1,3352 @@ + + + + + + + + + + + + + + + +Processing tag-based single-cell RNA-seq data with Alevin + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + +
+

Objectives

+

This notebook will demonstrate how to:

+
    +
  • Navigate the terminal interface
  • +
  • Quantify single cell expression data with Alevin
  • +
  • Perform basic quality control and interpret results
  • +
+
+

In this notebook, we will be running through the basics of processing +raw single-cell RNA-seq data.

+

We will be using a tag-based scRNA-seq sample from the Tabula +Muris project. This dataset is made up of data from 20 mouse organs that +were sequenced using 10x Genomics Chromium single cell sequencing +methods. For 10x Genomics scRNA-seq data, cells are separated by +emulsion/droplets, and individual cells are given barcodes (often +abbreviated ‘CB’ in documentation). Each transcript will also contain a +Unique +Molecular Identifier (UMI), which allows us to examine PCR +amplification errors and biases.

+
+Roadmap: Preprocessing & Initial QC +
Roadmap: Preprocessing & Initial QC
+
+
+
+

About the data

+

We obtained these data from the Tabula Muris project’s Figshare. +The BAM files that were on Figshare were converted to fastq +files using bamtofastq +from 10x Genomics. We will process fastq files from mouse +bladder as an example.
+To limit the amount of time this takes to run in the context of this +workshop, we are only running part of the sample’s reads.

+

Note: Depending on the format of the data you are working +with, i.e., if you have a set of .bcl files, you may need +to use cellranger mkfastq +to create .fastq files for each sample. However, most +public data is available in fastq format, and most +sequencing cores will generate the .fastq files, so that is +where we will start.

+
+
+

Checking directories and files

+

If you have opened the scRNA-seq.Rproj file, your +Terminal should already be set to the scRNA-seq directory, +but it is worth checking with the pwd command in the +Terminal (or by looking at the path shown in the command prompt or at +the top of the Terminal pane).

+

If you are in a different directory, we will want to use +cd to change to the training-modules/scRNA-seq +directory.

+

Copy and paste the text in the code blocks below into your +Terminal window in RStudio. It should be in the lower left +hand corner as a tab next to Console.

+
cd ~/training-modules/scRNA-seq
+

Once you are there, you should be able to run the following command +in the Terminal to look at the contents of the +data/tabula-muris directory:

+
ls data/tabula-muris
+

Here you will see the fastq directory, which is actually +a link to a shared folder with the raw fastq files, split by sample. We +will use these files, but we will not write to this directory.

+

We can look at the contents of the fastq directory, +and we should see 16 subfolders corresponding to 16 different samples. +Within each of these folders should be the fastq files +associated with that sample. Again, we can use the ls +command to show the contents of each of the directories.

+

In this scenario, 10X_P4_3 is the name of the sample +that we will be processing, which contains data from mouse bladder +cells.

+
ls data/tabula-muris/fastq/10X_P4_3
+

You should see a list of multiple fastq files all +starting with 10X_P4_3, indicating the sample name.

+

Notice that each fastq file name contains either +R1 or R2. These correspond to the two +sequencing reads of a paired-end library. For 10x data, the first read +(the R1 file) will contain the cell barcode and UMI +sequence, and the second read (the R2 file) will contain a +cDNA sequence corresponding to the captured transcript. We will need +both of these files to quantify our data.

+
10X_P4_3_L001_R1_001.fastq.gz  10X_P4_3_L002_R1_001.fastq.gz
+10X_P4_3_L001_R1_002.fastq.gz  10X_P4_3_L002_R1_002.fastq.gz
+10X_P4_3_L001_R1_003.fastq.gz  10X_P4_3_L002_R1_003.fastq.gz
+10X_P4_3_L001_R2_001.fastq.gz  10X_P4_3_L002_R2_001.fastq.gz
+10X_P4_3_L001_R2_002.fastq.gz  10X_P4_3_L002_R2_002.fastq.gz
+10X_P4_3_L001_R2_003.fastq.gz  10X_P4_3_L002_R2_003.fastq.gz
+
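If you would like to see what the barcode and UMI look like inside an R1 file, the sketch below pulls them out of a tiny, made-up fastq file. It assumes 10x v2 chemistry, where the first 16 bases of R1 are the cell barcode and the next 10 bases are the UMI. The reads and the file name `toy_R1.fastq.gz` are invented for illustration; for real data you would point `gzip -dc` at one of the R1 files listed above.

```shell
# Build a tiny, made-up R1 fastq file (two reads) in a temporary directory.
# Real files would come from data/tabula-muris/fastq/10X_P4_3 instead.
tmp_dir=$(mktemp -d)
toy_r1="$tmp_dir/toy_R1.fastq.gz"
printf '%s\n' \
  '@read1' 'AAACCTGAGAAACCATTTTTGGGGCC' '+' 'IIIIIIIIIIIIIIIIIIIIIIIIII' \
  '@read2' 'AAACCTGAGAAACCATACGTACGTAC' '+' 'IIIIIIIIIIIIIIIIIIIIIIIIII' |
  gzip -c > "$toy_r1"

# Sequence lines are every 4th line of a fastq file, starting at line 2.
# Assuming 10x v2 chemistry: bases 1-16 are the cell barcode, 17-26 the UMI.
barcode_umi=$(gzip -dc "$toy_r1" |
  awk 'NR % 4 == 2 { print substr($0, 1, 16) "\t" substr($0, 17, 10) }')
printf '%s\n' "$barcode_umi"
```

The two toy reads share a cell barcode but have different UMIs, which is what we would expect from two distinct transcripts captured in the same droplet.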

Sequencing runs are often split into multiple fastq +files, both when a sample was run across multiple lanes and to keep the +individual file sizes down. This was the case for the Tabula +Muris data we are using here as well. The files that you see in the +data/tabula-muris/fastq/10X_P4_3 directory shown above +represent 2 lanes worth of data, with three R1 and three R2 files per +lane.
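The lane and read information is encoded in the file names themselves, so a quick sanity check is possible without opening any files. The sketch below works on the file names copied from the listing above (stored in a variable rather than read from disk) and assumes the naming pattern `<sample>_<lane>_<read>_<chunk>.fastq.gz`, with the sample name itself containing two underscores.

```shell
# File names copied from the 10X_P4_3 listing shown above.
file_names=$(cat <<'EOF'
10X_P4_3_L001_R1_001.fastq.gz
10X_P4_3_L001_R1_002.fastq.gz
10X_P4_3_L001_R1_003.fastq.gz
10X_P4_3_L001_R2_001.fastq.gz
10X_P4_3_L001_R2_002.fastq.gz
10X_P4_3_L001_R2_003.fastq.gz
10X_P4_3_L002_R1_001.fastq.gz
10X_P4_3_L002_R1_002.fastq.gz
10X_P4_3_L002_R1_003.fastq.gz
10X_P4_3_L002_R2_001.fastq.gz
10X_P4_3_L002_R2_002.fastq.gz
10X_P4_3_L002_R2_003.fastq.gz
EOF
)

# The lane is the 4th underscore-separated field of each file name.
n_lanes=$(printf '%s\n' "$file_names" | cut -d '_' -f 4 | sort -u | wc -l | tr -d ' ')
# Count R1 and R2 files separately; every R1 file should have an R2 partner.
n_r1=$(printf '%s\n' "$file_names" | grep -c '_R1_')
n_r2=$(printf '%s\n' "$file_names" | grep -c '_R2_')
echo "lanes: $n_lanes, R1 files: $n_r1, R2 files: $n_r2"
```

If the R1 and R2 counts ever differ for a sample, a file is probably missing, and it is worth tracking that down before quantification.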

+

You will also see the file TM_droplet_metadata.csv, +which contains metadata for the Tabula Muris experiments.

+
+

Set up output directory

+

Now that we are in scRNA-seq, we’ll make a directory for +us to store our quantification files in. In Terminal, run +the following command:

+
mkdir -p data/tabula-muris/alevin-quant/10X_P4_3_subset
+
+
+
+

Quantifying cell expression with Salmon Alevin

+

Alevin +is run from the command line (Terminal) to perform mapping and +quantification of tag-based single cell expression data.

+
+

Indexing the mouse transcriptome

+

Before we can quantify with Salmon and Alevin, +we need to index the transcriptome for the species we will be mapping +to. This step would be the same for mapping bulk RNA-seq data, and you +can use the same transcriptome indexes as bulk RNA-seq. However, due to +the shorter read lengths in the 10x sequencing, we may want to use +shorter kmers than the default index size that salmon uses (31). In this +instance, we used a -k of 23.

+

In the interest of time, we have already run the command below and +have the index built and ready for you in a shared directory.

+

But for your own reference, here is how you might do it yourself:

+
# salmon --threads=16 --no-version-check index \
+#  -t Mus_musculus.GRCm38.cdna.all.fa.gz \
+#  -i index/Mus_musculus/short_index \
+#  -k 23
+

Scripts to build the indexes like those we are using here (and +others) can be found in this +repository.
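One way to think about the choice of `-k` is that a read shorter than k bases contains no complete k-mer, so it has no chance of matching the index. The sketch below finds the shortest read in a made-up, uncompressed fastq file and compares it to k. The toy reads and the file name `toy_R2.fastq` are invented for illustration, and this is only a rough sanity check under that assumption, not a rule taken from the Salmon documentation for picking k.

```shell
# Made-up cDNA reads standing in for an R2 fastq file.
tmp_dir=$(mktemp -d)
toy_r2="$tmp_dir/toy_R2.fastq"
printf '%s\n' \
  '@read1' 'ACGTACGTACGTACGTACGTACGTACGTAC' '+' 'IIIIIIIIIIIIIIIIIIIIIIIIIIIIII' \
  '@read2' 'ACGTACGTACGTACGTACGTACGTA' '+' 'IIIIIIIIIIIIIIIIIIIIIIIII' > "$toy_r2"

k=23  # the k-mer size used for the index in this notebook

# Find the shortest read (sequence lines are every 4th line, starting at 2).
min_len=$(awk 'NR % 4 == 2 { n = length($0); if (min == "" || n < min) min = n }
               END { print min }' "$toy_r2")

# A read shorter than k contains no complete k-mer, so it cannot be matched.
if [ "$min_len" -ge "$k" ]; then
  k_fits="yes"
else
  k_fits="no"
fi
echo "shortest read: $min_len bases; k = $k; every read can hold a k-mer: $k_fits"
```

For a gzipped file, the same `awk` command can read from `gzip -dc <file>` through a pipe instead of taking the file name as an argument.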

+
+
+

Running Salmon Alevin

+

Copy and paste this in your Terminal to run the Alevin +quantification. This will take about 20 minutes to run, so we will start +now, then talk about the options.

+

Note that here we are only giving the full paths to one of the +R1 files and one of the R2 files. For the sake +of time, we are only going to be running this on a subset of reads, but +will also show you how to run it on the full sample.

+
salmon alevin \
+  -i index/Mus_musculus/short_index \
+  -l ISR \
+  -1 data/tabula-muris/fastq/10X_P4_3/10X_P4_3_L001_R1_001.fastq.gz \
+  -2 data/tabula-muris/fastq/10X_P4_3/10X_P4_3_L001_R2_001.fastq.gz \
+  -o data/tabula-muris/alevin-quant/10X_P4_3_subset \
+  -p 4 \
+  --chromium  \
+  --tgMap index/Mus_musculus/Mus_musculus.GRCm38.95.versioned_tx2gene.tsv \
+  --dumpFeatures
+
+
+

Salmon Alevin command line options

+

For detailed information about all options available, see the Alevin +documentation and Salmon +documentation.

+

Many of the options for the salmon alevin command are +the same as those you would see when mapping and quantifying bulk +RNA-seq data with salmon quant:

+
    +
  • -i gives the location of the transcriptome index
  • +
  • -1 and -2 are the paired read input +files
  • +
  • -o designates the output folder
  • +
  • -p allows us to specify how many processors to use; in +this case we will use 4
  • +
+
+

-l

+

The -l option is for designating the library format. For +most single-cell quantification, you will want to use the +ISR library type. See Salmon’s +documentation for more information on fragment library types (and +all of the other options available). Note that this option must come +before the read files.

+
+
+

--chromium

+

Because we are using 10x v2 chromium data, we have to use this flag +to tell alevin where to expect the barcodes, UMIs and +sequence data. If we were using 10x v3 data, we would need the +--chromiumV3 flag instead. Drop-seq data is also supported, +for which we would use the --dropseq flag instead of +this.

+
+
+

--tgMap

+

The transcriptome file that we are mapping to has separate sequences +for each transcript of a gene, but due to the sparse nature of +single-cell data, we are not likely to be able to meaningfully +distinguish among different transcripts. For this reason, +alevin will quantify our results at the gene level, so we +need to provide a file that maps each transcript to its gene. For this +example, we’ve pre-made the file +Mus_musculus.GRCm38.95.versioned_tx2gene.tsv from the +Ensembl transcriptome that we indexed above. The file is a TSV +(tab-separated values) file with 2 columns: one of transcripts and the +other the gene that each comes from.
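As a rough illustration of the file layout described above, the sketch below writes a three-line toy map and checks that it has the expected shape. The transcript and gene IDs are placeholders, not entries from `Mus_musculus.GRCm38.95.versioned_tx2gene.tsv`, and we are assuming here that the file has no header row.

```shell
# A made-up three-line example of the transcript-to-gene map format.
# The IDs are placeholders, not entries copied from the real file.
tmp_dir=$(mktemp -d)
toy_map="$tmp_dir/toy_tx2gene.tsv"
printf 'TRANSCRIPT_A.1\tGENE_1.1\n'  > "$toy_map"
printf 'TRANSCRIPT_B.1\tGENE_1.1\n' >> "$toy_map"
printf 'TRANSCRIPT_C.2\tGENE_2.3\n' >> "$toy_map"

# Every line should have exactly two tab-separated columns.
bad_lines=$(awk -F '\t' 'NF != 2' "$toy_map" | wc -l | tr -d ' ')
# Several transcripts can map to one gene, so there are never more
# distinct genes than distinct transcripts.
n_transcripts=$(cut -f 1 "$toy_map" | sort -u | wc -l | tr -d ' ')
n_genes=$(cut -f 2 "$toy_map" | sort -u | wc -l | tr -d ' ')
echo "malformed lines: $bad_lines; transcripts: $n_transcripts; genes: $n_genes"
```

The same two checks can be run on a real map file by replacing `$toy_map` with its path; a nonzero count of malformed lines usually means the file is not tab-separated.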

+
+
+

--dumpFeatures

+

This option will print out information that we will use for quality +checks later on, including files with information on the UMIs and cell +barcodes.

+

See the Alevin +documentation for a complete list of the Alevin options. There are +also a number of example analyses at the Alevin +tutorial website.

+
+
+
+

Note: Running the FULL sample.

+

When we took a look at the +data/tabula-muris/fastq/10X_P4_3 directory earlier, we +noticed that there were multiple files representing 2 lanes worth of +data, with three R1 and three R2 files per lane.

+

We should really run all of these through Salmon, though that will +take about six times as long as the single pair of reads we used. To do +this, we could list each R1 and R2 file (space separated) after the +-1 and -2 arguments, respectively. But that is +a lot of typing, so a nice shortcut is to use a * character +to represent a wildcard that will be filled in with whatever characters +are present in the files at the given path. In the pattern +10X_P4_3_L*_R1_*.fastq.gz, this would allow any lane number +and any subset, so would match all of the following files (all the R1 +files in this case):

+
10X_P4_3_L001_R1_001.fastq.gz  10X_P4_3_L002_R1_001.fastq.gz
+10X_P4_3_L001_R1_002.fastq.gz  10X_P4_3_L002_R1_002.fastq.gz
+10X_P4_3_L001_R1_003.fastq.gz  10X_P4_3_L002_R1_003.fastq.gz
+

For this directory, that would make our full +salmon alevin command look like this (don’t run this +now!):

+
# salmon alevin \
+#   -i index/Mus_musculus/short_index \
+#   -l ISR \
+#   -1 data/tabula-muris/fastq/10X_P4_3/10X_P4_3_L*_R1_*.fastq.gz \
+#   -2 data/tabula-muris/fastq/10X_P4_3/10X_P4_3_L*_R2_*.fastq.gz \
+#   -o data/tabula-muris/alevin-quant/10X_P4_3 \
+#   -p 4 \
+#   --chromium  \
+#   --tgMap index/Mus_musculus/Mus_musculus.GRCm38.95.versioned_tx2gene.tsv \
+#   --dumpFeatures
+
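Before committing to a long run, it can be reassuring to see what a wildcard pattern will actually expand to. The sketch below is for illustration only: it creates a scratch directory of empty files named like the ones in `data/tabula-muris/fastq/10X_P4_3` (the scratch directory and the empty files are stand-ins, not real data), then lists and counts what each pattern matches.

```shell
# Build a scratch directory with empty files named like the 10X_P4_3 fastq files
# (the scratch directory and empty files are stand-ins for illustration only)
scratch_dir=$(mktemp -d)
for lane in L001 L002; do
  for read in R1 R2; do
    for part in 001 002 003; do
      touch "${scratch_dir}/10X_P4_3_${lane}_${read}_${part}.fastq.gz"
    done
  done
done

# The shell expands each wildcard pattern before the command runs,
# so `ls` (like any other command) receives the list of matching files
r1_files=$(ls "${scratch_dir}"/10X_P4_3_L*_R1_*.fastq.gz)
r2_files=$(ls "${scratch_dir}"/10X_P4_3_L*_R2_*.fastq.gz)

# Count the matches: we expect the same number of R1 and R2 files
n_r1=$(echo "${r1_files}" | wc -l | tr -d ' ')
n_r2=$(echo "${r2_files}" | wc -l | tr -d ' ')
echo "R1 files: ${n_r1}; R2 files: ${n_r2}"
```

With the real data, running `ls data/tabula-muris/fastq/10X_P4_3/10X_P4_3_L*_R1_*.fastq.gz` in the Terminal gives the same kind of check. The shell returns matches in sorted order, so the R1 and R2 lists line up file by file as long as the two sets of files are named consistently.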

In general, you will want to run all lanes and all files for a given +sample together. But DO NOT combine multiple +samples into a single alevin quantification! Keep +separate samples (and replicates) separate!

+
+
+
+

Initial quality control with alevinQC

+

Now that we have quantified our data with Alevin, we are ready to +perform initial quality control checks.

+

In order to perform these quality control checks, we’ll use the alevinQC R package. Note that alevinQC depends on files that we get using the --dumpFeatures option in Alevin.

+

About the alevinQCReport() function: The first argument needs to be the path to the directory where the sample’s output data was written when Alevin was run (as a character string, that is, in quotes). The rest of alevinQCReport()’s arguments tell R what to call the sample and where to put the output QC report.

+ + + +
# First, define path to alevin output:
+alevin_path <- file.path("data", "tabula-muris", "alevin-quant", "10X_P4_3_subset")
+
+# Produce a QC report of results found in the `alevin_path` directory
+alevinQC::alevinQCReport(alevin_path,
+                         sampleId = "10X_P4_3_subset",
+                         outputFile = "10X_P4_3_subset-qc_report.html",
+                         outputDir = "qc-reports")
+ + + +

Look for the 10X_P4_3_subset-qc_report.html file created +in the qc-reports directory to examine the quality of your +data and performance of Alevin.

+

We have also placed an example of a poor quality sample alevinQC +report in the qc-reports directory, with the name +Bad_Example_10X_P4_2_qc_report.html.

+

This report will show a few key metrics that inform you about the +quality of your sample. There is a lot of information included in the +report, so some key metrics to note are included in the +Summary tables:

+
    +
  • Fraction of reads in whitelist barcodes
  • +
  • Mean number of reads per cell
  • +
  • Median number of detected genes per cell
  • +
+

The fraction of reads in whitelist barcodes is particularly important +as a low percentage here means the library contains many reads that do +not contain the expected cell barcodes. This is indicative of poor +single-cell capture during library construction.

+

The mean number of reads per cell and median number of detected genes +per cell can be helpful in understanding how deeply the library was +sequenced. The higher these numbers are, the more information you will +obtain per cell.

+

The knee plot shows the number of distinct UMIs for +each possible cell barcode on the y-axis, with the barcodes ranked from +the most UMIs to the fewest along the x-axis.

+

Cell barcodes with low UMI counts are likely to be empty droplets +that did not contain a cell. These droplets must be filtered out so we +only consider true cells for downstream analysis.

+

To do this, we can look for a “knee” on the curve where the number of +UMIs per barcode starts to drop off rapidly, with the intuition that +this is where we are reaching the end of the UMIs per cell distribution +for true cells. We can then choose a threshold below the knee and only +include barcodes above this threshold in the final cell barcode +list.
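The rank-then-threshold idea can be sketched with a toy table. This is only an illustration of the logic described above, not alevin's implementation: the barcodes are shortened placeholders, the UMI counts are made up, and the threshold is a hypothetical value chosen by eye to sit below the obvious drop-off.

```shell
# Toy table of barcode<TAB>UMI count (placeholder barcodes, made-up counts)
counts_file=$(mktemp)
printf 'AAAC\t5200\nACGT\t35\nAACT\t4100\nATTC\t3\nAAAG\t4800\nAGGA\t12\n' > "${counts_file}"

# Rank barcodes from the most UMIs to the fewest, as on the knee plot x-axis;
# output columns are: rank, barcode, UMI count
tab=$(printf '\t')
ranked=$(sort -t "${tab}" -k2,2nr "${counts_file}" | awk -F '\t' '{print NR "\t" $1 "\t" $2}')
echo "${ranked}"

# Hypothetical threshold placed below the knee (between 4100 and 35 here)
threshold=1000

# Keep only the barcodes whose UMI count is at or above the threshold
n_kept=$(awk -F '\t' -v min_umi="${threshold}" '$2 >= min_umi' "${counts_file}" | wc -l | tr -d ' ')
echo "barcodes kept: ${n_kept}"
```

In this toy table the counts fall from thousands to tens between the third and fourth ranked barcodes, so any threshold in that gap keeps the same three barcodes. Real data rarely has such a clean gap, which is why the choice of threshold matters.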

+

This “knee” method, which is implemented by alevin, is fairly effective and does not require any read mapping or quantification before filtering. More recent versions of Cell Ranger use a somewhat different method based on the “empty drops” method of Lun et al. (2019), which is applied after initial gene quantification. This allows filtering to retain cells with low counts that are nonetheless likely to represent real cells.

+
+
+

Next steps: Loading Alevin output into R

+

After we have successfully quantified our tag-based scRNA-seq data +(and done some QC), we will want to read it into R to start to analyze +it. The easiest way to do this is to use the tximeta +package, which we will introduce in the next notebook.

+
+
+

Session Info

+ + + +
sessionInfo()
+ + +
R version 4.4.0 (2024-04-24)
+Platform: x86_64-pc-linux-gnu
+Running under: Ubuntu 22.04.4 LTS
+
+Matrix products: default
+BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+
+locale:
+ [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+ [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+ [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+ [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+ [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+
+time zone: Etc/UTC
+tzcode source: system (glibc)
+
+attached base packages:
+[1] stats     graphics  grDevices utils     datasets  methods   base     
+
+other attached packages:
+[1] optparse_1.7.5
+
+loaded via a namespace (and not attached):
+ [1] digest_0.6.35     R6_2.5.1          fastmap_1.1.1     xfun_0.43        
+ [5] magrittr_2.0.3    cachem_1.0.8      getopt_1.20.4     glue_1.7.0       
+ [9] stringr_1.5.1     knitr_1.46        htmltools_0.5.8.1 rmarkdown_2.26   
+[13] lifecycle_1.0.4   cli_3.6.2         sass_0.4.9        vctrs_0.6.5      
+[17] jquerylib_0.1.4   compiler_4.4.0    tools_4.4.0       bslib_0.7.0      
+[21] evaluate_0.23     yaml_2.3.8        jsonlite_1.8.8    rlang_1.1.3      
+[25] stringi_1.8.3    
+ + +
+ +
+ + +
+
+ +
+ + + + + + + + + + + + + + + + + diff --git a/completed-notebooks/scRNA-seq/02-filtering_scRNA.nb.html b/completed-notebooks/scRNA-seq/02-filtering_scRNA.nb.html new file mode 100644 index 0000000..93b7993 --- /dev/null +++ b/completed-notebooks/scRNA-seq/02-filtering_scRNA.nb.html @@ -0,0 +1,4065 @@ + + + + + + + + + + + + + + + +Single cell RNA-seq quality control and filtering + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + +
+

Objectives

+

This notebook will demonstrate how to:

+
    +
  • Import alevin results with tximeta
  • +
  • Calculate and examine cell quality measures
  • +
+
+

We will continue with the Tabula Muris data set that we started with +in the previous notebook.

+
+Roadmap: Preprocessing and Import +
+
+
+
+

Set Up

+ + + +
# tximeta for importing alevin results
+library(tximeta)
+ + +
Warning: replacing previous import 'S4Arrays::makeNindexFromArrayViewport' by
+'DelayedArray::makeNindexFromArrayViewport' when loading 'SummarizedExperiment'
+ + +
# SingleCellExperiment package for organizing our results
+library(SingleCellExperiment)
+ + +
Loading required package: SummarizedExperiment
+ + +
Loading required package: MatrixGenerics
+ + +
Loading required package: matrixStats
+ + +

+Attaching package: 'MatrixGenerics'
+ + +
The following objects are masked from 'package:matrixStats':
+
+    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
+    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
+    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
+    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
+    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
+    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
+    colWeightedMeans, colWeightedMedians, colWeightedSds,
+    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
+    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
+    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
+    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
+    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
+    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
+    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
+    rowWeightedSds, rowWeightedVars
+ + +
Loading required package: GenomicRanges
+ + +
Loading required package: stats4
+ + +
Loading required package: BiocGenerics
+ + +

+Attaching package: 'BiocGenerics'
+ + +
The following objects are masked from 'package:stats':
+
+    IQR, mad, sd, var, xtabs
+ + +
The following objects are masked from 'package:base':
+
+    anyDuplicated, aperm, append, as.data.frame, basename, cbind,
+    colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
+    get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
+    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
+    Position, rank, rbind, Reduce, rownames, sapply, setdiff, table,
+    tapply, union, unique, unsplit, which.max, which.min
+ + +
Loading required package: S4Vectors
+ + +

+Attaching package: 'S4Vectors'
+ + +
The following object is masked from 'package:utils':
+
+    findMatches
+ + +
The following objects are masked from 'package:base':
+
+    expand.grid, I, unname
+ + +
Loading required package: IRanges
+ + +
Loading required package: GenomeInfoDb
+ + +
Loading required package: Biobase
+ + +
Welcome to Bioconductor
+
+    Vignettes contain introductory material; view with
+    'browseVignettes()'. To cite Bioconductor, see
+    'citation("Biobase")', and for packages 'citation("pkgname")'.
+ + +

+Attaching package: 'Biobase'
+ + +
The following object is masked from 'package:MatrixGenerics':
+
+    rowMedians
+ + +
The following objects are masked from 'package:matrixStats':
+
+    anyMissing, rowMedians
+ + +
# ggplot2 for the plots
+library(ggplot2)
+ + + +
+
+

Import single-cell RNA-seq quantification

+
+

Directories and files

+

The data files we will be using for this part of the project are in +the data/tabula-muris subdirectory of the +scRNA-seq directory where this notebook is located.

+

The main files we will be using at this stage are the results from +our earlier quantification, located in the alevin-quant +subdirectory. Rather than just the subset, we will use the full data in +order to get a somewhat more realistic view of a 10x data set. This data +set is still a few years old though: newer datasets will tend to have +more cells!

+ + + +
# main data directory
+data_dir <- file.path("data", "tabula-muris")
+
+# reference files
+ref_dir <- file.path("data", "reference")
+
+# Path to the single-sample alevin results
+alevin_file <- file.path(data_dir, "alevin-quant",
+                         "10X_P4_3", "alevin", "quants_mat.gz")
+
+# Mitochondrial gene table
+mito_file <- file.path(ref_dir,
+                       "mm_mitochondrial_genes.tsv")
+
+# create the output directory using fs::dir_create()
+filtered_dir <- file.path(data_dir, "filtered")
+fs::dir_create(filtered_dir)
+
+# Output file
+filtered_sce_file <- file.path(filtered_dir, "filtered_sce.rds")
+ + + +
+
+
+

Importing alevin results with tximeta

+

tximeta needs a data frame with at least these two columns:

  • a files column with the file paths to the quants_mat.gz files
  • a names column with the sample names

+

In this case, we are only importing a single experiment, so we will +create a data frame with only one row.

+ + + +
coldata <- data.frame(files = alevin_file,
+                      names = "10X_P4_3")
+ + + +

Using the coldata data frame that we set up, we can now +run tximeta() to import our expression data while +automatically finding and associating the transcript annotations that +were used when we performed the quantification.

+

The first time you run tximeta() you may get a message +about storing downloaded transcriptome data in a cache directory so that +it can retrieve the data more quickly the next time. We recommend you +use the cache, and accept the default location.

+ + + +
# Read in alevin results with tximeta
+bladder_sce <- tximeta(coldata, type = "alevin")
+ + +
importing quantifications
+ + +
importing alevin data is much faster after installing 'eds'
+ + +
reading in alevin gene-level counts across cells 
+ + +
found matching transcriptome:
+[ Ensembl - Mus musculus - release 95 ]
+ + +
useHub=TRUE: checking for EnsDb via 'AnnotationHub'
+ + +
found matching EnsDb via 'AnnotationHub'
+ + +
downloading 1 resources
+ + +
retrieving 1 resource
+ + +
loading from cache
+ + +
require("ensembldb")
+ + +
generating gene ranges
+ + +
generating gene ranges
+ + + +

A quick aside! When we ran alevinQC on this data in the +last notebook, we saw that salmon alevin had identified a +“whitelist” of barcodes that passed its quality control standards. We +could use this filtered list directly, but salmon alevin +can be quite strict, and methods for filtering are quite variable. Instead, +we will use the default behavior of tximeta() and read in +all of the barcodes for which there is a non-zero UMI count (after +barcode correction). If you wanted instead to include only barcodes +that passed salmon alevin’s filter, you could supply the +additional argument alevinArgs = list(filterBarcodes = TRUE) +to the tximeta() function. Even if you do choose to read in +pre-filtered data, it’s still important to explore the data as we’re +about to do here and potentially filter further based on your +observations, in particular since mapping software’s quality control +measures (spoilers!) don’t always filter based on mitochondrial gene +content.

+

In the intro-to-R-tidyverse module notebook, +01-intro_to_base_R.Rmd, we discuss base R object types, but +there are some ‘special’ object types that are package-specific. +tximeta creates a SummarizedExperiment object +(or more specifically a RangedSummarizedExperiment object), +which is used by many Bioconductor packages to store and process results +from gene expression studies.

+ + + +
# Explore the SummarizedExperiment data
+bladder_sce
+ + +
class: RangedSummarizedExperiment 
+dim: 35429 344 
+metadata(6): tximetaInfo quantInfo ... txomeInfo txdbInfo
+assays(1): counts
+rownames(35429): ENSMUSG00000000001 ENSMUSG00000000003 ...
+  ENSMUSG00000117649 ENSMUSG00000117651
+rowData names(8): gene_id gene_name ... symbol entrezid
+colnames(344): CGGAGTCAGTACGCCC TTGGCAACATGATCCA ... ACGTCAAGTGTAATGA
+  ATTACTCAGAGAACAG
+colData names(0):
+ + + +

The main component we are concerned with for now is the +counts matrix, which is stored as an “assay”, with a row +for each gene and a column for each cell. In this case, we can see there +is information for 35,429 genes, and Alevin reports data for 344 +cells.

+

tximeta also automatically added some annotation +information about each gene, which can be seen by extracting the +rowData table.

+ + + +
# Examine row (gene) metadata
+rowData(bladder_sce)
+ + +
DataFrame with 35429 rows and 8 columns
+                              gene_id   gene_name         gene_biotype
+                          <character> <character>          <character>
+ENSMUSG00000000001 ENSMUSG00000000001       Gnai3       protein_coding
+ENSMUSG00000000003 ENSMUSG00000000003        Pbsn       protein_coding
+ENSMUSG00000000028 ENSMUSG00000000028       Cdc45       protein_coding
+ENSMUSG00000000037 ENSMUSG00000000037       Scml2       protein_coding
+ENSMUSG00000000049 ENSMUSG00000000049        Apoh       protein_coding
+...                               ...         ...                  ...
+ENSMUSG00000117643 ENSMUSG00000117643  AC122453.2 processed_pseudogene
+ENSMUSG00000117644 ENSMUSG00000117644  AC108777.1 processed_pseudogene
+ENSMUSG00000117646 ENSMUSG00000117646  AC122271.3 processed_pseudogene
+ENSMUSG00000117649 ENSMUSG00000117649  AC165087.2 processed_pseudogene
+ENSMUSG00000117651 ENSMUSG00000117651  CT485613.6 processed_pseudogene
+                   seq_coord_system            description
+                        <character>            <character>
+ENSMUSG00000000001       chromosome guanine nucleotide b..
+ENSMUSG00000000003       chromosome probasin [Source:MGI..
+ENSMUSG00000000028       chromosome cell division cycle ..
+ENSMUSG00000000037       chromosome Scm polycomb group p..
+ENSMUSG00000000049       chromosome apolipoprotein H [So..
+...                             ...                    ...
+ENSMUSG00000117643       chromosome Wilms tumour 1-assoc..
+ENSMUSG00000117644       chromosome gametocyte specific ..
+ENSMUSG00000117646       chromosome developmental plurip..
+ENSMUSG00000117649       chromosome heterogeneous nuclea..
+ENSMUSG00000117651       chromosome NSE1 homolog, SMC5-S..
+                         gene_id_version      symbol entrezid
+                             <character> <character>   <list>
+ENSMUSG00000000001  ENSMUSG00000000001.4       Gnai3    14679
+ENSMUSG00000000003 ENSMUSG00000000003.15        Pbsn    54192
+ENSMUSG00000000028 ENSMUSG00000000028.15       Cdc45    12544
+ENSMUSG00000000037 ENSMUSG00000000037.16       Scml2   107815
+ENSMUSG00000000049 ENSMUSG00000000049.11        Apoh    11818
+...                                  ...         ...      ...
+ENSMUSG00000117643  ENSMUSG00000117643.1  AC122453.2       NA
+ENSMUSG00000117644  ENSMUSG00000117644.1  AC108777.1       NA
+ENSMUSG00000117646  ENSMUSG00000117646.1  AC122271.3       NA
+ENSMUSG00000117649  ENSMUSG00000117649.1  AC165087.2       NA
+ENSMUSG00000117651  ENSMUSG00000117651.1  CT485613.6       NA
+ + + +

We could leave the object as it is, but we can unlock some extra +functionality by converting this from a +SummarizedExperiment object to a +SingleCellExperiment, so we will go ahead and do that next. +SingleCellExperiment objects are a subtype of +SummarizedExperiment objects that a lot of single-cell +analysis R packages use, so we will try to get acquainted with them.

+

For more information on SingleCellExperiment objects, as +well as many other topics related to this course, we highly recommend +the e-book Orchestrating +Single-Cell Analysis with Bioconductor (OSCA) and/or Amezquita +et al. (2020).

+

Below is a figure from OSCA that shows the general structure of +SingleCellExperiment objects.

+

+

Note that there are slots for raw data, metadata about cells, +metadata about genes or features, and slots for various transformations +of the input data. Many of these will not be filled in when we first +create the object, but as we proceed through the workshop we will add +more data to these slots as we compute new summaries and +transformations.

+

To perform the conversion to a SingleCellExperiment, we +will use the R function as(), which “coerces” objects from +one type to another.

+ + + +
# Convert the SummarizedExperiment to a SingleCellExperiment
+bladder_sce <- as(bladder_sce, "SingleCellExperiment")
+bladder_sce
+ + +
class: SingleCellExperiment 
+dim: 35429 344 
+metadata(6): tximetaInfo quantInfo ... txomeInfo txdbInfo
+assays(1): counts
+rownames(35429): ENSMUSG00000000001 ENSMUSG00000000003 ...
+  ENSMUSG00000117649 ENSMUSG00000117651
+rowData names(8): gene_id gene_name ... symbol entrezid
+colnames(344): CGGAGTCAGTACGCCC TTGGCAACATGATCCA ... ACGTCAAGTGTAATGA
+  ATTACTCAGAGAACAG
+colData names(0):
+reducedDimNames(0):
+mainExpName: NULL
+altExpNames(0):
+ + + +

Doing this added a couple of (currently empty) slots for things like +dimensionality reduction results and alternative feature experiments. +Foreshadowing!

+
+
+

Summarizing expression

+

For a first pass at the data, we will extract just the counts matrix +from the SingleCellExperiment object, and use some base R +functions to look at our results.

+

We can extract the gene by cell count matrix using the +counts() function. This actually returns a special format +of matrix called a “sparse” matrix. Since single cell count data is +mostly zeros, this format (a dgCMatrix object) allows R to +save a lot of memory. This object takes up about 6.4 MB, but if we +stored it in the normal format, it would be closer to 100 MB! +Thankfully, most of the functions that we use to work with regular +matrices work just fine with these as well.

+ + + +
sc_counts <- counts(bladder_sce)
+ + + +
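To make the memory savings of a sparse format concrete, here is a minimal sketch on simulated data. The matrix dimensions, the Poisson rate, and all object names are assumptions chosen for illustration; the exact sizes will differ from the figures quoted above for the bladder data.

```r
library(Matrix)  # provides the sparse matrix classes, including dgCMatrix

set.seed(2021)
n_genes <- 2000
n_cells <- 300

# Simulated counts with a low mean, so most entries are zero
dense_counts <- matrix(as.numeric(rpois(n_genes * n_cells, lambda = 0.05)),
                       nrow = n_genes, ncol = n_cells)

# Convert the ordinary matrix to a sparse representation
sparse_counts <- Matrix(dense_counts, sparse = TRUE)

# Compare the memory footprint of the two representations
format(object.size(dense_counts), units = "MB")
format(object.size(sparse_counts), units = "MB")

# Common summaries work on both and give the same answer
all.equal(rowSums(dense_counts), rowSums(sparse_counts))
```

The sparse version stores only the non-zero values and their positions, which is why the saving grows as the proportion of zeros increases.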

Let’s look at the mean expression of the genes in this dataset. We +will use apply() in order to calculate things across our +matrix. The second argument in apply() specifies +whether we are calculating by rows or columns (1 = rows, 2 = +columns).

+

In the code chunk below, use apply() with the correct +arguments to calculate the gene means.

+ + + +
# Let's calculate the gene means (by row)
+gene_means <- apply(sc_counts, 1, mean)
+ + + +

This works just fine, but you may have noticed it is a bit slow. For +a few common summary functions like means and sums, R has much more +efficient functions to calculate across rows or columns. In this case, +we can use rowMeans() to do the same calculation much more +quickly.

+ + + +
# use rowMeans() to calculate gene means
+gene_means <- rowMeans(sc_counts)
+ + + +
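As a quick check that the two approaches really are interchangeable, the sketch below compares them on a small simulated matrix. The matrix and object names are made up for illustration, and the timings will vary by machine.

```r
set.seed(1)
# Toy matrix: 500 "genes" (rows) by 40 "cells" (columns)
toy_mat <- matrix(rpois(500 * 40, lambda = 2), nrow = 500)

# Row means computed two ways
means_apply <- apply(toy_mat, 1, mean)
means_fast <- rowMeans(toy_mat)

# The two approaches agree
all.equal(means_apply, means_fast)

# Rough timings over repeated runs; exact numbers depend on the machine
system.time(for (i in 1:20) apply(toy_mat, 1, mean))
system.time(for (i in 1:20) rowMeans(toy_mat))
```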

Let’s make our first density plot with these data. We will use +ggplot() as you have seen before, but since the object we +want to plot, gene_means, is a vector not a data frame, we +will skip the data argument and go straight to the +mapping aesthetics. The remainder of the +ggplot code should look familiar.

+ + + +
# Plot the density of the means using ggplot2
+ggplot(mapping = aes(x = gene_means)) +
+  geom_density() +
+  labs(x = "Mean gene count")
+ + +

+ + + +

That plot is not quite as informative as we might like, as a few +genes with high expression are making the scale just a bit +wide. Let’s zoom in on the left part of the graph by adding +xlim() to the plot. (Note that xlim() will remove +points outside the specified range, so you will get a warning.)

+ + + +
# Plot the density of the means using ggplot2
+ggplot(mapping = aes(x = gene_means)) +
+  geom_density() +
+  labs(x = "Mean gene count") +
+  xlim(0, 5)
+ + +
Warning: Removed 203 rows containing non-finite outside the scale range
+(`stat_density()`).
+ + +

+ + + +

Even as we zoom in, the counts data has many zeros, which we +actually expect in a single-cell RNA-seq experiment.

+

Let’s calculate what proportion of the count data is zeros:

+ + + +
sum(sc_counts == 0)/(nrow(sc_counts) * ncol(sc_counts))
+ + +
[1] 0.9447591
+ + + +
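The same proportion can be written more compactly, because the mean of a logical matrix is the fraction of `TRUE` values. A small sketch on a made-up ordinary matrix (the object name is hypothetical):

```r
# Toy matrix with 4 zeros out of 6 entries
toy_zero_mat <- matrix(c(0, 3, 0,
                         0, 0, 1), nrow = 2, byrow = TRUE)

# The explicit calculation: number of zeros divided by number of entries
prop_explicit <- sum(toy_zero_mat == 0) / (nrow(toy_zero_mat) * ncol(toy_zero_mat))

# The same proportion, taking the mean of the logical matrix
prop_mean <- mean(toy_zero_mat == 0)

c(explicit = prop_explicit, mean = prop_mean)
```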
+
+

Quality control measures for the counts matrix

+

The small amount of RNA in a single cell results in higher chances of +errors and biases in RNA isolation, amplification, and sequencing. We +should check that the overall data we observe for each sample/cell are +reasonable before proceeding too far.

+

The next section explores some of the ways we can filter the data set +to clean things up before we continue to downstream analysis.

+
+Figure: QC and filtering
+
+
+

Total counts as a quality measure

+

First, let’s look at the total number of counts per cell, across all +genes. For this we will use colSums(), as each column +represents a different sampled cell.

+ + + +
# Make a total_counts vector with the number of counts per cell using colSums()
+total_counts <- colSums(sc_counts)
+ + + + + + +
# Take a look at the summary statistics for the total counts
+summary(total_counts)
+ + +
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
+    1.0   287.8  3971.5  8089.9 12008.8 62446.0 
+ + + +

Yikes, at least one of the cells has only 1 count, compared to the +median of ~4000! It’s highly likely that this ‘cell’ is either an empty +droplet or did not get sequenced properly.

+

Let’s visualize the distribution of total counts to see if this is +the only cell we might want to exclude.

+

In the following graphs, we will use vertical red lines to indicate +possible cutoffs.

+ + + +
# Let's use the same kind of plot as above but add more layers
+ggplot(mapping = aes(x = total_counts)) +
+  geom_density(fill = "lightblue") +
+  geom_vline(xintercept = 1000, color = "red") +
+  labs(x = "Counts per cell")
+ + +

+ + + +

How many cells would be removed with this (or other cutoffs) for +counts per sample?

+ + + +
# Calculate the number of cells that would be removed with a given cutoff
+count_cutoff <- 1000
+sum(total_counts <= count_cutoff)
+ + +
[1] 133
+ + + +
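To compare several candidate cutoffs at once, the same calculation can be wrapped in `sapply()`. This is a sketch on made-up totals; the values and object names are hypothetical and not taken from the bladder data.

```r
# Toy per-cell totals (made-up values)
toy_total_counts <- c(1, 250, 900, 1000, 4000, 12000)

# Candidate cutoffs to compare
candidate_cutoffs <- c(500, 1000, 2000)

# For each cutoff, count the cells at or below it
cells_removed <- sapply(candidate_cutoffs,
                        function(cutoff) sum(toy_total_counts <= cutoff))
names(cells_removed) <- candidate_cutoffs
cells_removed
```

Looking at how quickly the number of removed cells changes between cutoffs can help show whether a threshold sits in a sparse or a dense part of the distribution.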
+
+

Number of genes a cell expressed as a quality measure

+

What if a single gene accounted for all counts in a particular cell? +This cell would not have helpful data for us, so we should look to +remove any cells we suspect might not have a useful amount of their +transcriptome measured.

+

But before we can determine how many genes we consider a particular +cell to be expressing, we need to determine a numeric cutoff for what we +consider to be a detected gene. How many counts must there be for you to +consider a gene expressed? Here, let’s go for a simple detection cutoff +of > 0.

+ + + +
# make a detection_mat matrix that is TRUE when a gene is expressed in a sample
+detection_mat <- sc_counts > 0
+ + + +

Now that we have turned our data into a matrix of +TRUE/FALSE for detection, we can sum this data by column to +effectively get a vector of how many genes were measured in each +cell.

+ + + +
# Make a vector that contains the number of genes expressed by a particular cell
+num_genes_exp <- colSums(detection_mat)
+ + + +
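The step from a logical matrix to a per-cell gene count relies on `TRUE` being treated as 1 and `FALSE` as 0 when summed. A small sketch with a made-up matrix (gene names, cell names, and values are all hypothetical):

```r
# Toy counts: 4 genes (rows) by 3 cells (columns)
toy_gene_counts <- matrix(c(0, 5, 0,
                            2, 0, 0,
                            1, 1, 0,
                            0, 3, 9),
                          nrow = 4, byrow = TRUE,
                          dimnames = list(paste0("gene", 1:4),
                                          paste0("cell", 1:3)))

# TRUE wherever a gene has at least one count in a cell
toy_detection <- toy_gene_counts > 0

# Column sums of the logical matrix give the genes detected per cell
toy_genes_per_cell <- colSums(toy_detection)
toy_genes_per_cell
```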

Let’s plot this using the same style and type of graph as above.

+ + + +
ggplot(mapping = aes(x = num_genes_exp)) +
+  geom_density(fill = "lightblue") +
+  labs(x = "Number of genes expressed") +
+  theme_classic()
+ + +

+ + + +

This plot helps us visualize the distribution of genes per cell and +can help inform how we choose the cutoff. It’s important to remember +that different cell types can have quite different patterns with regard +to number of genes expressed. If we were to use strict cutoffs to select +which cells are “valid”, there is the possibility that we could bias our +results, so this is something we want to be careful about.

+

Let’s see what happens if we only keep cells with > 500 expressed +genes. Just like when we looked at total counts, we can add in a +vertical line to the previous plot where the possible cutoff would +be.

+ + + +
ggplot(mapping = aes(x = num_genes_exp)) +
+  geom_density(fill = "lightblue") +
+  labs(x = "Number of genes expressed") +
+  theme_classic() +
+  geom_vline(xintercept = 500, color = "red")
+ + +

+ + + +

How many cells would be removed with this cutoff?

+ + + +
# Calculate the number of cells that would be removed with a given cutoff
+gene_cutoff <- 500
+sum(num_genes_exp <= gene_cutoff)
+ + +
[1] 145
+ + + +
+
+

Mitochondrial gene expression

+

If a cell is dead or dying, its mRNA will tend to leak out of the +cell, leaving an overabundance of mitochondrial RNA, which is more +likely to stay within the mitochondria longer. To look for this, we +would like to calculate the fraction of mitochondrial expression for +each cell as well. First, we will need a list of the mitochondrial +genes, which we have prepared in a tsv file +mm_mitochondrial_genes.tsv that we will now read in, and +filter to just the genes that are found in the data set.

+ + + +
# read `mm_mitochondrial_genes.tsv` from ref_dir and
+# create from it a single vector containing only the gene ids
+mito_genes <- readr::read_tsv(mito_file) |>
+  # filter to only genes in the sce object
+  dplyr::filter(gene_id %in% rownames(bladder_sce)) |>
+  # pull takes this column out of the data frame as a stand-alone vector
+  dplyr::pull(gene_id)
+ + +
Rows: 37 Columns: 13
+── Column specification ────────────────────────────────────────────────────────
+Delimiter: "\t"
+chr (9): gene_id, gene_name, seqnames, strand, gene_biotype, seq_coord_syste...
+dbl (4): start, end, width, entrezid
+
+ℹ Use `spec()` to retrieve the full column specification for this data.
+ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
+ + + +

Now we can use the genes from that list to select only the rows of +the count matrix that correspond to the mitochondrial genes and sum +their expression for each sample.

+ + + +
# create a mito_rows vector that is TRUE for mitochondrial genes in our dataset
+mito_rows <- rownames(sc_counts) %in% mito_genes
+
+# sum the counts from just those genes for all samples
+mito_counts <- colSums(sc_counts[mito_rows, ])
+
+# calculate mito_fraction for all samples
+mito_fraction <- mito_counts/total_counts
+ + + +
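The sketch below walks through the same three steps on a tiny made-up matrix so the arithmetic can be checked by hand. The gene names, cell names, and counts are hypothetical. It also adds `drop = FALSE` when subsetting, an extra precaution not used in the notebook code: without it, a subset that matches only one row becomes a plain vector and `colSums()` fails.

```r
# Toy counts: 4 genes by 3 cells
toy_mito_mat <- matrix(c(10,  0, 4,
                          5,  2, 0,
                          5,  8, 1,
                          0, 10, 5),
                       nrow = 4, byrow = TRUE,
                       dimnames = list(c("geneA", "geneB", "mt-1", "mt-2"),
                                       c("cell1", "cell2", "cell3")))
toy_mito_genes <- c("mt-1", "mt-2")

# Logical vector marking the mitochondrial rows
toy_mito_rows <- rownames(toy_mito_mat) %in% toy_mito_genes

# drop = FALSE keeps a matrix even if only one row matches
toy_mito_counts <- colSums(toy_mito_mat[toy_mito_rows, , drop = FALSE])

# Fraction of each cell's counts that come from mitochondrial genes
toy_mito_fraction <- toy_mito_counts / colSums(toy_mito_mat)
toy_mito_fraction
```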

Let’s make a plot of this distribution as well!

+ + + +
ggplot(mapping = aes(x = mito_fraction)) +
+  geom_density(fill = "lightblue") +
+  labs(x = "Mitochondrial fraction") +
+  geom_vline(xintercept = 0.2, color = "red") +
+  theme_classic()
+ + +

+ + + +

Here, we want to keep cells with a low fraction of reads +corresponding to mitochondrial genes and remove any cells with a high +mitochondrial fraction. Again, it’s important to take this step even if +you started with filtered data, since mapping software like +salmon alevin and Cell Ranger do not usually consider +mitochondrial read percentages when filtering.

+
+
+

Combining sample QC measures

+

Let’s put all of the QC measures we have calculated into a single data +frame, so we can look at how they might relate to one another.

+ + + +
# make a data frame with number of genes expressed, total counts, and mito fraction
+qc_df <- data.frame(barcode = names(num_genes_exp),
+                    genes_exp = num_genes_exp,
+                    total_counts = total_counts,
+                    mito_fraction = mito_fraction)
+ + + +

Now we can plot these measures all together, along with some possible +cutoffs.

+ + + +
ggplot(qc_df, aes(x = total_counts,
+                  y = genes_exp,
+                  color = mito_fraction)) +
+  geom_point(alpha = 0.5) +
+  scale_color_viridis_c() +
+  geom_vline(xintercept = 1000, color = "red") +
+  geom_hline(yintercept = 500, color = "red") +
+  labs(x = "Total Count",
+       y = "Number of Genes Expressed",
+       color = "Mitochondrial\nFraction") +
+  theme_bw()
+ + +

+ + + +

If we want to filter our data based on these measures and cutoffs we +like, we can do this with dplyr::filter() and then select +the resulting columns from the matrix.

+ + + +
# create a filtered_samples data frame from qc_df
+filtered_samples <- qc_df |>
+  dplyr::filter(total_counts > 1000,
+                genes_exp > 500,
+                mito_fraction < 0.2)
+# select only passing samples for sc_counts_filtered
+sc_counts_filtered <- sc_counts[, filtered_samples$barcode]
+ + + +
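For readers who want to see the filtering logic without `dplyr`, here is a base R sketch of the same three conditions using `subset()`. The toy data frame, its barcodes, and its values are made up; only the column names and cutoffs mirror the code above.

```r
# Toy QC table with hypothetical barcodes and values
toy_qc_df <- data.frame(
  barcode = c("AAAC", "CCGT", "GGTA", "TTAG"),
  genes_exp = c(1200, 300, 2500, 800),
  total_counts = c(5000, 600, 15000, 2000),
  mito_fraction = c(0.05, 0.10, 0.02, 0.45),
  stringsAsFactors = FALSE
)

# Keep rows that pass all three cutoffs
toy_passing <- subset(toy_qc_df,
                      total_counts > 1000 &
                        genes_exp > 500 &
                        mito_fraction < 0.2)

# Barcodes that would be used to select columns of the counts matrix
toy_passing$barcode
```

Here the second cell fails on counts and genes, and the fourth fails only on mitochondrial fraction, which shows why all three measures are checked together.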
+
+
+

Filtering the SingleCellExperiment directly

+
+

Calculating cell QC stats with scater

+

The methods above were nice for demonstrating the kinds of filtering +we might do, but all the steps would certainly be repetitive if we had +to do them for each sample. Thankfully, there are some nice methods that +have been developed in packages like scater to perform them +all at once and add the results to the SingleCellExperiment +object. The advantages of using functions like this are that we can keep +all of the metadata together, filter directly on the object of interest, +avoid a lot of repetition, and in doing so avoid many potential +errors.

+

We will start with the function addPerCellQC(), which +takes a SingleCellExperiment and a list of gene sets +that we might want to calculate subset information for. In our case, we +will just look at mitochondrial genes again.

+ + + +
bladder_sce <- scater::addPerCellQC(
+  bladder_sce,
+  # a list of named gene subsets that we want stats for
+  # here we are using mitochondrial genes
+  subsets = list(mito = mito_genes)
+  )
+ + + +

The results of these calculations are now stored as a data frame in +the colData slot of the SingleCellExperiment +object, which we can pull out with the colData() function. +(Unfortunately, it is not quite a regular data frame, but we can easily +convert it to one.) Even nicer, we can access the QC data in those +columns directly with just the familiar $ syntax!

+

The calculated statistics include sum, the total UMI +count for the cell, detected, the number of genes detected, +and a few different statistics for each subset that we gave, including +the percent (not fraction!) of all UMIs from the subset. Since the +subset we used was named mito, this column is called +subsets_mito_percent.

+

Using these, we can recreate the plot from before:

+ + + +
# extract the column data and convert to a data frame
+bladder_qc <- data.frame(colData(bladder_sce))
+
+# plot with the qc data frame
+ggplot(bladder_qc, aes(x = sum,
+                       y = detected,
+                       color = subsets_mito_percent)) +
+  geom_point(alpha = 0.5) +
+  scale_color_viridis_c() +
+  labs(x = "Total Count",
+       y = "Number of Genes Expressed",
+       color = "Mitochondrial\nPercent") +
+  theme_bw()
+ + +

+ + + +
+
+

Applying a filter to a SingleCellExperiment

+

Filtering the SingleCellExperiment object is done as if +it were just the counts matrix, with brackets and indexes. While this +will look much like what we did before, it is better, because it will +also keep the filtered QC stats alongside, in case we wanted to revisit +them later. Otherwise, we would have to filter our QC results +separately, which is an easy place for errors to creep in.

+ + + +
# create a boolean vector of QC filters
+cells_to_keep <- bladder_sce$sum > 1000 &
+  bladder_sce$detected > 500 &
+  bladder_sce$subsets_mito_percent < 20
+
+# filter the sce object (cells are columns)
+bladder_sce_filtered <- bladder_sce[, cells_to_keep]
+ + + +
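The mitochondrial cutoff here is 20 rather than 0.2 because the `subsets_mito_percent` column is a percent, while the earlier hand calculation produced a fraction. A small sketch, using made-up values and hypothetical object names, shows that the two cutoffs select the same cells:

```r
# Made-up per-cell mitochondrial fractions
toy_fraction <- c(0.05, 0.19, 0.25, 0.45)

# The same quantity expressed as a percent
toy_percent <- 100 * toy_fraction

# A cutoff of 0.2 on the fraction corresponds to a cutoff of 20 on the percent
keep_by_fraction <- toy_fraction < 0.2
keep_by_percent <- toy_percent < 20
identical(keep_by_fraction, keep_by_percent)
```

Mixing the two scales is an easy mistake: applying `< 0.2` to a percent column would discard nearly every cell.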

Just to check, we should have the same number of cells in +bladder_sce_filtered as our previous +sc_counts_filtered.

+ + + +
ncol(sc_counts_filtered) == ncol(bladder_sce_filtered)
+ + +
[1] TRUE
+ + + +
+
+
+

Number of cells that express a gene as a quality measure

+

Now we have an idea of what cells we probably want to get rid of. But +what if our data contains genes that we can’t reliably measure in these +cells?

+

We could use our earlier detection_mat to add up how +many cells express each gene, but we will skip straight to the +scater function this time, which is called +addPerFeatureQC(). This will add QC statistics to the +rowData for each gene (alongside the annotation data we +already had there). The columns it adds are the average expression level +of each gene (mean) and the percentage of cells in which it +was detected (detected).

+ + + +
bladder_sce_filtered <- scater::addPerFeatureQC(bladder_sce_filtered)
+ + + +

Let’s make another density plot with the percentage of samples that +express each gene:

+ + + +
# extract the gene information with rowData() and convert to a data frame
+gene_info <- data.frame(rowData(bladder_sce_filtered))
+
+# Plot the detected percentage
+ggplot(gene_info, aes(x = detected)) +
+  geom_density(fill = "lightblue") +
+  labs(x = "Percent of Cells Expressing Each Gene") +
+  theme_classic()
+ + +

+ + + +

How many genes will be excluded if we draw our cutoff at 5% of +cells?

+ + + +
sum(gene_info$detected < 5)
+ + +
[1] 23960
+ + + +

That’s a lot! How do we feel about that?

+ + + +
cutoff <- 2
+# filter bladder_sce_filtered to only genes above a cutoff value
+bladder_sce_filtered <- bladder_sce_filtered[gene_info$detected >= cutoff, ]
+ + + +
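To see what a per-gene detection percentage means, the sketch below computes one by hand on a tiny made-up matrix and applies a cutoff. This is an illustration of the idea only, not a claim about how `scater` computes its `detected` column internally; the gene names, values, and cutoff are hypothetical.

```r
# Toy counts: 3 genes by 4 cells
toy_feature_mat <- matrix(c(3, 0, 1, 2,
                            0, 0, 0, 7,
                            0, 0, 0, 0),
                          nrow = 3, byrow = TRUE,
                          dimnames = list(c("geneA", "geneB", "geneC"), NULL))

# Percent of cells in which each gene has at least one count
toy_detected <- 100 * rowMeans(toy_feature_mat > 0)
toy_detected

# Keep genes detected in at least a chosen percent of cells
toy_gene_cutoff <- 50
toy_feature_filtered <- toy_feature_mat[toy_detected >= toy_gene_cutoff, ,
                                        drop = FALSE]
dim(toy_feature_filtered)
```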

How big is the SingleCellExperiment object now?

+ + + +
dim(bladder_sce_filtered)
+ + +
[1] 13648   186
+ + + +
+
+

Save the filtered data

+

We will save the filtered SingleCellExperiment object as +a .rds file for later use.

+ + + +
# Save object to the file filtered_sce_file, which
+# we defined at the top of this notebook
+readr::write_rds(bladder_sce_filtered, file = filtered_sce_file)
+ + + + +
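When the object is needed again, it can be read back from the same file. The sketch below shows a save-and-reload round trip using the base R functions `saveRDS()` and `readRDS()`, which `readr::write_rds()` and `readr::read_rds()` wrap. The stand-in object and the temporary file are assumptions for illustration, so nothing in the project directory is touched.

```r
# A small stand-in object; any R object can be saved this way
toy_object <- list(counts = matrix(1:6, nrow = 2), note = "example")

# Write to a temporary file
toy_file <- tempfile(fileext = ".rds")
saveRDS(toy_object, file = toy_file)

# Read it back and confirm it matches the original
toy_reloaded <- readRDS(toy_file)
identical(toy_object, toy_reloaded)
```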
+ +
ZXMgd2hldGhlciB3ZSBhcmUgY2FsY3VsYXRpbmcgYnkgcm93cyBvciBjb2x1bW5zLgooMSA9IHJvd3MsIDIgPSBjb2x1bW5zKS4KCkluIHRoZSBjb2RlIGNodW5rIGJlbG93LCB1c2UgYGFwcGx5KClgIHdpdGggdGhlIGNvcnJlY3QgYXJndW1lbnRzIHRvIGNhbGN1bGF0ZSB0aGUgZ2VuZSBtZWFucy4KCmBgYHtyIG1lYW5zLCBsaXZlID0gVFJVRX0KIyBMZXQncyBjYWxjdWxhdGUgdGhlIGdlbmUgbWVhbnMgKGJ5IHJvdykKZ2VuZV9tZWFucyA8LSBhcHBseShzY19jb3VudHMsIDEsIG1lYW4pCmBgYAoKVGhpcyB3b3JrcyBqdXN0IGZpbmUsIGJ1dCB5b3UgbWF5IGhhdmUgbm90aWNlZCBpdCBpcyBhIGJpdCBzbG93LgpGb3IgYSBmZXcgY29tbW9uIHN1bW1hcnkgZnVuY3Rpb25zIGxpa2UgbWVhbnMgYW5kIHN1bXMsIFIgaGFzIG11Y2ggbW9yZSBlZmZpY2llbnQgZnVuY3Rpb25zIHRvIGNhbGN1bGF0ZSBhY3Jvc3Mgcm93cyBvciBjb2x1bW5zLgpJbiB0aGlzIGNhc2UsIHdlIGNhbiB1c2UgYHJvd01lYW5zKClgIHRvIGRvIHRoZSBzYW1lIGNhbGN1bGF0aW9uIG11Y2ggbW9yZSBxdWlja2x5LgoKCmBgYHtyIHJvd21lYW5zfQojIHVzZSByb3dNZWFucygpIHRvIGNhbGN1bGF0ZSBnZW5lIG1lYW5zCmdlbmVfbWVhbnMgPC0gcm93TWVhbnMoc2NfY291bnRzKQpgYGAKCkxldCdzIG1ha2Ugb3VyIGZpcnN0IGRlbnNpdHkgcGxvdCB3aXRoIHRoZXNlIGRhdGEuCldlIHdpbGwgdXNlIGBnZ3Bsb3QoKWAgYXMgeW91IGhhdmUgc2VlbiBiZWZvcmUsIGJ1dCBzaW5jZSB0aGUgb2JqZWN0IHdlIHdhbnQgdG8gcGxvdCwgYGdlbmVfbWVhbnNgLCBpcyBhIHZlY3RvciBub3QgYSBkYXRhIGZyYW1lLCB3ZSB3aWxsIHNraXAgdGhlIGBkYXRhYCBhcmd1bWVudCBhbmQgZ28gc3RyYWlnaHQgdG8gdGhlIGBtYXBwaW5nYCBhZXN0aGV0aWNzLgpUaGUgcmVtYWluZGVyIG9mIHRoZSBgZ2dwbG90YCBjb2RlIHNob3VsZCBsb29rIGZhbWlsaWFyLgoKYGBge3IgbWVhbl9kZW5zaXR5fQojIFBsb3QgdGhlIGRlbnNpdHkgb2YgdGhlIG1lYW5zIHVzaW5nIGdncGxvdDIKZ2dwbG90KG1hcHBpbmcgPSBhZXMoeCA9IGdlbmVfbWVhbnMpKSArCiAgZ2VvbV9kZW5zaXR5KCkgKwogIGxhYnMoeCA9ICJNZWFuIGdlbmUgY291bnQiKQpgYGAKClRoYXQgcGxvdCBpcyBub3QgcXVpdGUgYXMgaW5mb3JtYXRpdmUgYXMgd2UgbWlnaHQgbGlrZSwgYXMgYSBmZXcgZ2VuZXMgd2l0aCBoaWdoIGV4cHJlc3Npb24gYXJlIG1ha2luZyB0aGUgc2NhbGUganVzdCBhICpiaXQqIHdpZGUuCkxldHMgem9vbSBpbiBvbiB0aGUgbGVmdCBwYXJ0IG9mIHRoZSBncmFwaCBieSBhZGRpbmcgYW4gYHhsaW0oKWAgYXJndW1lbnQuCihOb3RlIHRoYXQgYHhsaW0oKWAgd2lsbCByZW1vdmUgcG9pbnRzIG91dHNpZGUgdGhlIHNwZWNpZmllZCByYW5nZSwgc28geW91IHdpbGwgZ2V0IGEgd2FybmluZy4pCgpgYGB7ciB6b29tX2RlbnNpdHksIGxpdmUgPSBUUlVFfQojIFBsb3QgdGhlIGRlbnNpdHkgb2YgdGhl
IG1lYW5zIHVzaW5nIGdncGxvdDIKZ2dwbG90KG1hcHBpbmcgPSBhZXMoeCA9IGdlbmVfbWVhbnMpKSArCiAgZ2VvbV9kZW5zaXR5KCkgKwogIGxhYnMoeCA9ICJNZWFuIGdlbmUgY291bnQiKSArCiAgeGxpbSgwLCA1KQpgYGAKCkV2ZW4gYXMgd2Ugem9vbSBpbiwgdGhlIGNvdW50cyBkYXRhIGhhcyBtYW55IHplcm9lcywgd2hpY2ggd2UgYWN0dWFsbHkgZXhwZWN0IGluIGEgc2luZ2xlIGNlbGwgUk5BLXNlcSBleHBlcmltZW50LgoKTGV0J3MgY2FsY3VsYXRlIHdoYXQgcHJvcG9ydGlvbiBvZiB0aGUgY291bnQgZGF0YSBpcyB6ZXJvczoKCmBgYHtyIHplcm9fZnJhY3Rpb24sIGxpdmUgPSBUUlVFfQpzdW0oc2NfY291bnRzID09IDApLyhucm93KHNjX2NvdW50cykgKiBuY29sKHNjX2NvdW50cykpCmBgYAoKCiMjIFF1YWxpdHkgY29udHJvbCBtZWFzdXJlcyBmb3IgdGhlIGNvdW50cyBtYXRyaXgKClRoZSBzbWFsbCBhbW91bnQgb2YgUk5BIGluIGEgc2luZ2xlIGNlbGwgcmVzdWx0cyBpbiBoaWdoZXIgY2hhbmNlcyBvZiBlcnJvcnMgYW5kIGJpYXNlcyBpbiBSTkEgaXNvbGF0aW9uLCBhbXBsaWZpY2F0aW9uLCBhbmQgc2VxdWVuY2luZy4KV2Ugc2hvdWxkIGNoZWNrIHRoYXQgdGhlIG92ZXJhbGwgZGF0YSB3ZSBvYnNlcnZlIGZvciBlYWNoIHNhbXBsZS9jZWxsIGFyZSByZWFzb25hYmxlIGJlZm9yZSBwcm9jZWVkaW5nIHRvbyBmYXIuCgpUaGUgbmV4dCBzZWN0aW9uIGV4cGxvcmVzIHNvbWUgb2YgdGhlIHdheXMgd2UgY2FuIGZpbHRlciB0aGUgZGF0YSBzZXQgdG8gY2xlYW4gdGhpbmdzIHVwIGJlZm9yZSB3ZSBjb250aW51ZSB0byBkb3duc3RyZWFtIGFuYWx5c2lzLgoKIVtRQyBhbmQgZmlsdGVyaW5nXShkaWFncmFtcy9yb2FkbWFwX3NpbmdsZV9xY19ub3JtX2FsZXZpbi5wbmcpCgojIyMjIFRvdGFsIGNvdW50cyBhcyBhIHF1YWxpdHkgbWVhc3VyZQoKRmlyc3QsIGxldHMgbG9vayBhdCB0aGUgdG90YWwgbnVtYmVyIG9mIGNvdW50cyBwZXIgY2VsbCwgYWNyb3NzIGFsbCBnZW5lcy4KRm9yIHRoaXMgd2Ugd2lsbCB1c2UgYGNvbFN1bXMoKWAsIGFzIGVhY2ggY29sdW1uIHJlcHJlc2VudHMgYSBkaWZmZXJlbnQgc2FtcGxlZCBjZWxsLgoKYGBge3IgdG90YWxfY291bnRzLCBsaXZlID0gVFJVRX0KIyBNYWtlIGEgdmVjdG9yIG9mIHRvdGFsX2NvdW50cyBudW1iZXIgb2YgY291bnRzIHBlciBzYW1wbGUgdXNpbmcgY29sU3VtcygpCnRvdGFsX2NvdW50cyA8LSBjb2xTdW1zKHNjX2NvdW50cykKYGBgCgoKYGBge3IgY291bnRzX3N1bW1hcnksIGxpdmUgPSBUUlVFfQojIFRha2UgYSBsb29rIGF0IHRoZSBzdW1tYXJ5IHN0YXRpc3RpY3MgZm9yIHRoZSB0b3RhbCBjb3VudHMKc3VtbWFyeSh0b3RhbF9jb3VudHMpCmBgYAoKWWlrZXMsIGF0IGxlYXN0IG9uZSBvZiB0aGUgY2VsbHMgaGFzIG9ubHkgMSByZWFkISwgY29tcGFyZWQgdG8gdGhlIG1lZGlhbiBvZiB+NDAwMCEKSXQncyBoaWdobHkgbGlrZWx5IHRoYXQgdGhpcyAnY2VsbCcgaXMg
ZWl0aGVyIGFuIGVtcHR5IHdlbGwgb3IgZGlkIG5vdCBnZXQgc2VxdWVuY2VkIHByb3Blcmx5LgoKTGV0J3MgdmlzdWFsaXplIHRoZSBkaXN0cmlidXRpb24gb2YgdG90YWwgY291bnRzIHRvIHNlZSBpZiB0aGlzIGlzIHRoZSBvbmx5IGNlbGwgd2UgbWlnaHQgd2FudCB0byBleGNsdWRlLgoKSW4gZm9sbG93aW5nIGdyYXBocywgd2Ugd2lsbCB1c2UgdmVydGljYWwgcmVkIGxpbmVzIHRvIGluZGljYXRlIHBvc3NpYmxlIGN1dG9mZnMuCgpgYGB7ciB0b3RhbF9jb3VudHNfcGxvdCwgbGl2ZSA9IFRSVUV9CiMgTGV0J3MgdXNlIHRoZSBzYW1lIGtpbmQgb2YgcGxvdCBhcyBhYm92ZSBidXQgYWRkIG1vcmUgbGF5ZXJzCmdncGxvdChtYXBwaW5nID0gYWVzKHggPSB0b3RhbF9jb3VudHMpKSArCiAgZ2VvbV9kZW5zaXR5KGZpbGwgPSAibGlnaHRibHVlIikgKwogIGdlb21fdmxpbmUoeGludGVyY2VwdCA9IDEwMDAsIGNvbG9yID0gInJlZCIpICsKICBsYWJzKHggPSAiQ291bnRzIHBlciBjZWxsIikKYGBgCgpIb3cgbWFueSBjZWxscyB3b3VsZCBiZSByZW1vdmVkIHdpdGggdGhpcyAob3Igb3RoZXIgY3V0b2ZmcykgZm9yIGNvdW50cyBwZXIgc2FtcGxlPwoKYGBge3IgY291bnRfY3V0b2Zmc30KIyBDYWxjdWxhdGUgdGhlIG51bWJlciBvZiBjZWxscyB0aGF0IHdvdWxkIGJlIHJlbW92ZWQgd2l0aCBhIGdpdmVuIGN1dG9mZgpjb3VudF9jdXRvZmYgPC0gMTAwMApzdW0odG90YWxfY291bnRzIDw9IGNvdW50X2N1dG9mZikKYGBgCgoKIyMjIE51bWJlciBvZiBnZW5lcyBhIGNlbGwgZXhwcmVzc2VkIGFzIGEgcXVhbGl0eSBtZWFzdXJlCgpXaGF0IGlmIGEgc2luZ2xlIGdlbmUgYWNjb3VudGVkIGZvciBhbGwgY291bnRzIGluIGEgcGFydGljdWxhciBjZWxsPwpUaGlzIGNlbGwgd291bGQgbm90IGhhdmUgaGVscGZ1bCBkYXRhIGZvciB1cywgc28gd2Ugc2hvdWxkIGxvb2sgdG8gcmVtb3ZlIGFueSBjZWxscyB3ZSBzdXNwZWN0IG1pZ2h0IG5vdCBoYXZlIGEgdXNlZnVsIGFtb3VudCBvZiBpdHMgdHJhbnNjcmlwdG9tZSBtZWFzdXJlZC4KCkJ1dCBiZWZvcmUgd2UgY2FuIGRldGVybWluZSBob3cgbWFueSBnZW5lcyB3ZSBjb25zaWRlciBhIHBhcnRpY3VsYXIgY2VsbCB0byBiZSBleHByZXNzaW5nIHdlIG5lZWQgdG8gZGV0ZXJtaW5lIGEgbnVtZXJpYyBjdXRvZmYgZm9yIHdoYXQgd2UgY29uc2lkZXIgdG8gYmUgYSBkZXRlY3RlZCBnZW5lLgpIb3cgbWFueSBjb3VudHMgbXVzdCB0aGVyZSBiZSBmb3IgeW91IHRvIGNvbnNpZGVyIGEgZ2VuZSBleHByZXNzZWQ/CkhlcmUgbGV0J3MgZ28gZm9yIGEgc2ltcGxlIGRldGVjdGlvbiBjdXRvZmYgb2YgPiAwLgoKYGBge3IgZGV0ZWN0aW9uX21hdHJpeCwgbGl2ZT1UUlVFfQojIG1ha2UgYSBkZXRlY3Rpb25fbWF0IG1hdHJpeCB0aGF0IGlzIFRSVUUgd2hlbiBhIGdlbmUgaXMgZXhwcmVzc2VkIGluIGEgc2FtcGxlCmRldGVjdGlvbl9tYXQgPC0gc2NfY291bnRzID4gMApgYGAKCk5vdyB0aGF0IHdlIGhhdmUg
dHVybmVkIG91ciBkYXRhIGludG8gYSBtYXRyaXggb2YgYFRSVUUvRkFMU0VgIGZvciBkZXRlY3Rpb24sIHdlIGNhbiBzdW0gdGhpcyBkYXRhIGJ5IGNvbHVtbiB0byBlZmZlY3RpdmVseSBnZXQgYSB2ZWN0b3Igb2YgaG93IG1hbnkgZ2VuZXMgd2VyZSBtZWFzdXJlZCBpbiBlYWNoIGNlbGwuCgpgYGB7ciBnZW5lc19leHByZXNzZWQsIGxpdmUgPSBUUlVFfQojIE1ha2UgYSB2ZWN0b3IgdGhhdCBjb250YWlucyB0aGUgbnVtYmVyIG9mIGdlbmVzIGV4cHJlc3NlZCBieSBhIHBhcnRpY3VsYXIgY2VsbApudW1fZ2VuZXNfZXhwIDwtIGNvbFN1bXMoZGV0ZWN0aW9uX21hdCkKYGBgCgpMZXQncyBwbG90IHRoaXMgdXNpbmcgdGhlIHNhbWUgc3R5bGUgYW5kIHR5cGUgb2YgZ3JhcGggYXMgYWJvdmUuCgpgYGB7ciBnZW5lc19leHByZXNzZWRfcGxvdH0KZ2dwbG90KG1hcHBpbmcgPSBhZXMoeCA9IG51bV9nZW5lc19leHApKSArCiAgZ2VvbV9kZW5zaXR5KGZpbGwgPSAibGlnaHRibHVlIikgKwogIGxhYnMoeCA9ICJOdW1iZXIgb2YgZ2VuZXMgZXhwcmVzc2VkIikgKwogIHRoZW1lX2NsYXNzaWMoKQpgYGAKClRoaXMgcGxvdCBoZWxwcyB1cyB2aXN1YWxpemUgdGhlIGRpc3RyaWJ1dGlvbiBvZiBnZW5lcyBwZXIgY2VsbCBhbmQgY2FuIGhlbHAgaW5mb3JtIGhvdyB3ZSBjaG9vc2UgdGhlIGN1dG9mZi4KSXQncyBpbXBvcnRhbnQgdG8gcmVtZW1iZXIgdGhhdCBkaWZmZXJlbnQgY2VsbCB0eXBlcyBjYW4gaGF2ZSBxdWl0ZSBkaWZmZXJlbnQgcGF0dGVybnMgd2l0aCByZWdhcmRzIHRvIG51bWJlciBvZiBnZW5lcyBleHByZXNzZWQuCklmIHdlIHdlcmUgdG8gdXNlIHN0cmljdCBjdXRvZmZzIHRvIHNlbGVjdCB3aGljaCBjZWxscyBhcmUgInZhbGlkIiwgdGhlcmUgaXMgdGhlIHBvc3NpYmlsaXR5IHRoYXQgd2UgY291bGQgYmlhcyBvdXIgcmVzdWx0cywgc28gdGhpcyBpcyBzb21ldGhpbmcgd2Ugd2FudCB0byBiZSBjYXJlZnVsIGFib3V0LgoKTGV0J3Mgc2VlIHdoYXQgaGFwcGVucyBpZiB3ZSBvbmx5IGtlZXAgY2VsbHMgd2l0aCA+IDUwMCBleHByZXNzZWQgZ2VuZXMuCkp1c3QgbGlrZSB3aGVuIHdlIGxvb2tlZCBhdCB0b3RhbCBjb3VudHMsIHdlIGNhbiBhZGQgaW4gYSB2ZXJ0aWNhbCBsaW5lIHRvIHRoZSBwcmV2aW91cyBwbG90IHdoZXJlIHRoZSBwb3NzaWJsZSBjdXRvZmYgd291bGQgYmUuCgpgYGB7ciBnZW5lc19leHByZXNzZWRfY3V0b2ZmLCBsaXZlID0gVFJVRX0KZ2dwbG90KG1hcHBpbmcgPSBhZXMoeCA9IG51bV9nZW5lc19leHApKSArCiAgZ2VvbV9kZW5zaXR5KGZpbGwgPSAibGlnaHRibHVlIikgKwogIGxhYnMoeCA9ICJOdW1iZXIgb2YgZ2VuZXMgZXhwcmVzc2VkIikgKwogIHRoZW1lX2NsYXNzaWMoKSArCiAgZ2VvbV92bGluZSh4aW50ZXJjZXB0ID0gNTAwLCBjb2xvciA9ICJyZWQiKQpgYGAKSG93IG1hbnkgY2VsbHMgd291bGQgYmUgcmVtb3ZlZCB3aXRoIHRoaXMgY3V0b2ZmPwoKYGBge3IgY291bnRfZ2VuZV9jdXRvZmZzLCBs
aXZlID0gVFJVRX0KIyBDYWxjdWxhdGUgdGhlIG51bWJlciBvZiBjZWxscyB0aGF0IHdvdWxkIGJlIHJlbW92ZWQgd2l0aCBhIGdpdmVuIGN1dG9mZgpnZW5lX2N1dG9mZiA8LSA1MDAKc3VtKG51bV9nZW5lc19leHAgPD0gZ2VuZV9jdXRvZmYpCmBgYAoKIyMjIE1pdG9jaG9uZHJpYWwgZ2VuZSBleHByZXNzaW9uCgpJZiBhIGNlbGwgaXMgZGVhZCBvciBkeWluZywgaXRzIG1STkEgd2lsbCB0ZW5kIHRvIGxlYWsgb3V0IG9mIHRoZSBjZWxsLCBsZWF2aW5nIGFuIG92ZXJhYnVuZGFuY2Ugb2YgbWl0b2Nob25kcmlhbCBSTkEsIHdoaWNoIGlzIG1vcmUgbGlrZWx5IHRvIHN0YXkgd2l0aGluIHRoZSBtaXRvY2hvbmRyaWEgbG9uZ2VyLgpUbyBsb29rIGZvciB0aGlzLCB3ZSB3b3VsZCBsaWtlIHRvIGNhbGN1bGF0ZSB0aGUgZnJhY3Rpb24gb2YgbWl0b2Nob25kcmlhbCBleHByZXNzaW9uIGZvciBlYWNoIGNlbGwgYXMgd2VsbC4KRmlyc3QsIHdlIHdpbGwgbmVlZCBhIGxpc3Qgb2YgdGhlIG1pdG9jaG9uZHJpYWwgZ2VuZXMsIHdoaWNoIHdlIGhhdmUgcHJlcGFyZWQgaW4gYSB0c3YgZmlsZSBgbW1fbWl0b2Nob25kcmlhbF9nZW5lcy50c3ZgIHRoYXQgd2Ugd2lsbCBub3cgcmVhZCBpbiwgYW5kIGZpbHRlciB0byBqdXN0IHRoZSBnZW5lcyB0aGF0IGFyZSBmb3VuZCBpbiB0aGUgZGF0YSBzZXQuCgoKYGBge3IgcmVhZF9taXRvfQojIHJlYWQgYG1tX21pdG9jaG9uZHJpYWxfZ2VuZXMudHN2YCBmcm9tIHJlZl9kaXIgYW5kCiMgY3JlYXRlIGZyb20gaXQgYSBzaW5nbGUgdmVjdG9yIGNvbnRhaW5pbmcgb25seSB0aGUgZ2VuZSBpZHMKbWl0b19nZW5lcyA8LSByZWFkcjo6cmVhZF90c3YobWl0b19maWxlKSB8PgogICMgZmlsdGVyIHRvIG9ubHkgZ2VuZSBpbiB0aGUgc2NlIG9iamVjdAogIGRwbHlyOjpmaWx0ZXIoZ2VuZV9pZCAlaW4lIHJvd25hbWVzKGJsYWRkZXJfc2NlKSkgfD4KICAjIHB1bGwgdGFrZXMgdGhpcyBjb2x1bW4gb3V0IG9mIHRoZSBkYXRhIGZyYW1lIGFzIGEgc3RhbmQtYWxvbmUgdmVjdG9yCiAgZHBseXI6OnB1bGwoZ2VuZV9pZCkKYGBgCgpOb3cgd2UgY2FuIHVzZSB0aGUgZ2VuZXMgZnJvbSB0aGF0IGxpc3QgdG8gc2VsZWN0IG9ubHkgdGhlIHJvd3Mgb2YgdGhlIGNvdW50IG1hdHJpeCB0aGF0IGNvcnJlc3BvbmQgdG8gdGhlIG1pdG9jaG9uZHJpYWwgZ2VuZXMgYW5kIHN1bSB0aGVpciBleHByZXNzaW9uIGZvciBlYWNoIHNhbXBsZS4KCmBgYHtyIG1pdG9fZmlsdGVyLCBsaXZlID0gVFJVRX0KIyBjcmVhdGUgYSBtaXRvX3Jvd3MgdmVjdG9yIHRoYXQgaXMgVFJVRSBmb3IgbWl0b2Nob25kcmlhbCBnZW5lcyBpbiBvdXIgZGF0YXNldAptaXRvX3Jvd3MgPC0gcm93bmFtZXMoc2NfY291bnRzKSAlaW4lIG1pdG9fZ2VuZXMKCiMgc3VtIHRoZSBjb3VudHMgZnJvbSBqdXN0IHRob3NlIGdlbmVzIGZvciBhbGwgc2FtcGxlcwptaXRvX2NvdW50cyA8LSBjb2xTdW1zKHNjX2NvdW50c1ttaXRvX3Jvd3MsIF0pCgojIGNhbGN1bGF0ZSBt
aXRvX2ZyYWN0aW9uIGZvciBhbGwgc2FtcGxlcwptaXRvX2ZyYWN0aW9uIDwtIG1pdG9fY291bnRzL3RvdGFsX2NvdW50cwpgYGAKCkxldHMgbWFrZSBhIHBsb3Qgb2YgdGhpcyBkaXN0cmlidXRpb24gYXMgd2VsbCEKCmBgYHtyIHBsb3RfbWl0bywgbGl2ZSA9IFRSVUV9CmdncGxvdChtYXBwaW5nID0gYWVzKHggPSBtaXRvX2ZyYWN0aW9uKSkgKwogIGdlb21fZGVuc2l0eShmaWxsID0gImxpZ2h0Ymx1ZSIpICsKICBsYWJzKHggPSAiTWl0Y2hvbmRyaWFsIGZyYWN0aW9uIikgKwogIGdlb21fdmxpbmUoeGludGVyY2VwdCA9IDAuMiwgY29sb3IgPSAicmVkIikgKwogIHRoZW1lX2NsYXNzaWMoKQpgYGAKSGVyZSwgd2Ugd2FudCB0byBrZWVwIGNlbGxzIHdpdGggYSBsb3cgZnJhY3Rpb24gb2YgcmVhZHMgY29ycmVzcG9uZGluZyB0byBtaXRvY2hvbmRyaWFsIGdlbmVzIGFuZCByZW1vdmUgYW55IGNlbGxzIHdpdGggYSBoaWdoIG1pdG9jaG9uZHJpYWwgZnJhY3Rpb24uCkFnYWluLCBpdCdzIGltcG9ydGFudCB0byB0YWtlIHRoaXMgc3RlcCBldmVuIGlmIHlvdSBzdGFydGVkIHdpdGggZmlsdGVyZWQgZGF0YSwgc2luY2UgbWFwcGluZyBzb2Z0d2FyZSBsaWtlIGBzYWxtb24gYWxldmluYCBhbmQgQ2VsbCBSYW5nZXIgZG8gbm90IHVzdWFsbHkgY29uc2lkZXIgbWl0b2Nob25kcmlhbCByZWFkIHBlcmNlbnRhZ2VzIHdoZW4gZmlsdGVyaW5nLgoKIyMjIENvbWJpbmluZyBzYW1wbGUgUUMgbWVhc3VyZXMKCkxldHMgcHV0IGFsbCBvZiB0aGUgUUMgbWVhc3VyZXMgd2UgaGF2ZSBjYWxjdWxhdGVkIGludG8gYSBzaW5nbGUgZGF0YSBmcmFtZSwgc28gd2UgY2FuIGxvb2sgYXQgaG93IHRoZXkgbWlnaHQgcmVsYXRlIHRvIG9uZSBhbm90aGVyLgoKYGBge3IgcWNfZGF0YWZyYW1lLCBsaXZlID0gVFJVRX0KIyBtYWtlIGEgZGF0YSBmcmFtZSB3aXRoIG51bWJlciBvZiBnZW5lcyBleHByZXNzZWQsIHRvdGFsIGNvdW50cywgYW5kIG1pdG8gZnJhY3Rpb24KcWNfZGYgPC0gZGF0YS5mcmFtZShiYXJjb2RlID0gbmFtZXMobnVtX2dlbmVzX2V4cCksCiAgICAgICAgICAgICAgICAgICAgZ2VuZXNfZXhwID0gbnVtX2dlbmVzX2V4cCwKICAgICAgICAgICAgICAgICAgICB0b3RhbF9jb3VudHMgPSB0b3RhbF9jb3VudHMsCiAgICAgICAgICAgICAgICAgICAgbWl0b19mcmFjdGlvbiA9IG1pdG9fZnJhY3Rpb24pCgpgYGAKCk5vdyB3ZSBjYW4gcGxvdCB0aGVzZSBtZWFzdXJlcyBhbGwgdG9nZXRoZXIsIGFsb25nIHdpdGggc29tZSBwb3NzaWJsZSBjdXRvZmZzLgoKYGBge3IgcWNfc2NhdHRlcnBsb3R9CmdncGxvdChxY19kZiwgYWVzICh4ID0gdG90YWxfY291bnRzLAogICAgICAgICAgICAgICAgICAgeSA9IGdlbmVzX2V4cCwKICAgICAgICAgICAgICAgICAgIGNvbG9yID0gbWl0b19mcmFjdGlvbikpICsKICBnZW9tX3BvaW50KGFscGhhID0gMC41KSArCiAgc2NhbGVfY29sb3JfdmlyaWRpc19jKCkgKwogIGdlb21fdmxpbmUoeGludGVyY2VwdCA9IDEwMDAs
IGNvbG9yID0gInJlZCIpICsKICBnZW9tX2hsaW5lKHlpbnRlcmNlcHQgPSA1MDAsIGNvbG9yID0gInJlZCIpICsKICBsYWJzKHggPSAiVG90YWwgQ291bnQiLAogICAgICAgeSA9ICJOdW1iZXIgb2YgR2VuZXMgRXhwcmVzc2VkIiwKICAgICAgIGNvbG9yID0gIk1pdG9jaG9uZHJpYWxcbkZyYWN0aW9uIikgKwogIHRoZW1lX2J3KCkKYGBgCgpJZiB3ZSB3YW50IHRvIGZpbHRlciBvdXIgZGF0YSBiYXNlZCBvbiB0aGVzZSBtZWFzdXJlcyBhbmQgY3V0b2ZmcyB3ZSBsaWtlLCB3ZSBjYW4gZG8gdGhpcyB3aXRoIGBkcGx5cjo6ZmlsdGVyKClgIGFuZCB0aGVuIHNlbGVjdCB0aGUgcmVzdWx0aW5nIGNvbHVtbnMgZnJvbSB0aGUgbWF0cml4LgoKYGBge3IgcWNfZmlsdGVyLCBsaXZlID0gVFJVRX0KIyBjcmVhdGUgYSBmaWx0ZXJlZF9zYW1wbGVzIGRhdGEgZnJhbWUgZnJvbSBxY19kZgpmaWx0ZXJlZF9zYW1wbGVzIDwtIHFjX2RmIHw+CiAgZHBseXI6OmZpbHRlcih0b3RhbF9jb3VudHMgPiAxMDAwLAogICAgICAgICAgICAgICAgZ2VuZXNfZXhwID4gNTAwLAogICAgICAgICAgICAgICAgbWl0b19mcmFjdGlvbiA8IDAuMikKIyBzZWxlY3Qgb25seSBwYXNzaW5nIHNhbXBsZXMgZm9yIGJsYWRkZXJfc2NlX2ZpbHRlcmVkCnNjX2NvdW50c19maWx0ZXJlZCA8LSBzY19jb3VudHNbLCBmaWx0ZXJlZF9zYW1wbGVzJGJhcmNvZGVdCmBgYAoKIyMgRmlsdGVyaW5nIHRoZSBgU2luZ2xlQ2VsbEV4cGVyaW1lbnRgIGRpcmVjdGx5CgojIyMgQ2FsY3VsYXRpbmcgY2VsbCBRQyBzdGF0cyB3aXRoIGBzY2F0ZXJgCgpUaGUgbWV0aG9kcyBhYm92ZSB3ZXJlIG5pY2UgZm9yIGRlbW9uc3RyYXRpbmcgdGhlIGtpbmRzIG9mIGZpbHRlcmluZyB3ZSBtaWdodCBkbywgYnV0IGFsbCB0aGUgc3RlcHMgd291bGQgY2VydGFpbmx5IGJlIHJlcGV0aXRpdmUgaWYgd2UgaGFkIHRvIGRvIHRoZW0gZm9yIGVhY2ggc2FtcGxlLgpUaGFua2Z1bGx5LCB0aGVyZSBhcmUgc29tZSBuaWNlIG1ldGhvZHMgdGhhdCBoYXZlIGJlZW4gZGV2ZWxvcGVkIGluIHBhY2thZ2VzIGxpa2UgYHNjYXRlcmAgdG8gcGVyZm9ybSB0aGVtIGFsbCBhdCBvbmNlIGFuZCBhZGQgdGhlIHJlc3VsdHMgdG8gdGhlIGBTaW5nbGVDZWxsRXhwZXJpbWVudGAgb2JqZWN0LgpUaGUgYWR2YW50YWdlcyBvZiB1c2luZyBmdW5jdGlvbnMgbGlrZSB0aGlzIGFyZSB0aGF0IHdlIGNhbiBrZWVwIGFsbCBvZiB0aGUgbWV0YWRhdGEgdG9nZXRoZXIsIGZpbHRlciBkaXJlY3RseSBvbiB0aGUgb2JqZWN0IG9mIGludGVyZXN0LCBhdm9pZCBhIGxvdCBvZiByZXBldGl0aW9uLCBhbmQgaW4gZG9pbmcgc28gYXZvaWQgbWFueSBwb3RlbnRpYWwgZXJyb3JzLgoKV2Ugd2lsbCBzdGFydCB3aXRoIHRoZSBmdW5jdGlvbiBgYWRkUGVyQ2VsbFFDKClgLCB3aGljaCB0YWtlcyBhIGBTaW5nbGVDZWxsRXhwZXJpbWVudGAgYW5kIGEgbGlzdCBvZiBnZW5lIHNldHMgdGhhdCB0aGF0IHdlIG1pZ2h0IHdhbnQgdG8gY2FsY3VsYXRlIHN1
YnNldCBpbmZvcm1hdGlvbiBmb3IuCkluIG91ciBjYXNlLCB3ZSB3aWxsIGp1c3QgbG9vayBhdCBtaXRvY2hvbmRyaWFsIGdlbmVzIGFnYWluLgoKYGBge3J9CmJsYWRkZXJfc2NlIDwtIHNjYXRlcjo6YWRkUGVyQ2VsbFFDKAogIGJsYWRkZXJfc2NlLAogICMgYSBsaXN0IG9mIG5hbWVkIGdlbmUgc3Vic2V0cyB0aGF0IHdlIHdhbnQgc3RhdHMgZm9yCiAgIyBoZXJlIHdlIGFyZSB1c2luZyBtaXRvY2hvbmRyaWFsIGdlbmVzCiAgc3Vic2V0cyA9IGxpc3QobWl0byA9IG1pdG9fZ2VuZXMpCiAgKQpgYGAKClRoZSByZXN1bHRzIG9mIHRoZXNlIGNhbGN1bGF0aW9ucyBhcmUgbm93IHN0b3JlZCBhcyBhIGRhdGEgZnJhbWUgaW4gdGhlIGBjb2xEYXRhYCBzbG90IG9mIHRoZSBgU2luZ2xlQ2VsbEV4cGVyaW1lbnRgIG9iamVjdCwgd2hpY2ggd2UgY2FuIHB1bGwgb3V0IHdpdGggdGhlIGBjb2xEYXRhKClgIGZ1bmN0aW9uLgooVW5mb3J0dW5hdGVseSwgaXQgaXMgbm90IHF1aXRlIGEgcmVndWxhciBkYXRhIGZyYW1lLCBidXQgd2UgY2FuIGVhc2lseSBjb252ZXJ0IGl0IHRvIG9uZS4pCkV2ZW4gbmljZXIsIHdlIGNhbiBhY2Nlc3MgdGhlIFFDIGRhdGEgaW4gdGhvc2UgY29sdW1ucyBkaXJlY3RseSB3aXRoIGp1c3QgdGhlIGZhbWlsaWFyIGAkYCBzeW50YXghCgpUaGUgY2FsY3VsYXRlZCBzdGF0aXN0aWNzIGluY2x1ZGUgYHN1bWAsIHRoZSB0b3RhbCBVTUkgY291bnQgZm9yIHRoZSBjZWxsLCBgZGV0ZWN0ZWRgLCB0aGUgbnVtYmVyIG9mIGdlbmVzIGRldGVjdGVkLCBhbmQgYSBmZXcgZGlmZmVyZW50IHN0YXRpc3RpY3MgZm9yIGVhY2ggc3Vic2V0IHRoYXQgd2UgZ2F2ZSwgaW5jbHVkaW5nIHRoZSBwZXJjZW50IChub3QgZnJhY3Rpb24hKSBvZiBhbGwgVU1JcyBmcm9tIHRoZSBzdWJzZXQuClNpbmNlIHRoZSBzdWJzZXQgd2UgdXNlZCB3YXMgbmFtZWQgYG1pdG9gLCB0aGlzIGNvbHVtbiBpcyBjYWxsZWQgYHN1YnNldHNfbWl0b19wZXJjZW50YC4KClVzaW5nIHRoZXNlLCB3ZSBjYW4gcmVjcmVhdGUgdGhlIHBsb3QgZnJvbSBiZWZvcmU6CgpgYGB7ciByZXBsb3R9CiMgZXh0cmFjdCB0aGUgY29sdW1uIGRhdGEgYW5kIGNvbnZlcnQgdG8gYSBkYXRhIGZyYW1lCmJsYWRkZXJfcWMgPC0gZGF0YS5mcmFtZShjb2xEYXRhKGJsYWRkZXJfc2NlKSkKCiMgcGxvdCB3aXRoIHRoZSBxYyBkYXRhIGZyYW1lCmdncGxvdChibGFkZGVyX3FjLCBhZXMgKHggPSBzdW0sCiAgICAgICAgICAgICAgICAgICAgICAgIHkgPSBkZXRlY3RlZCwKICAgICAgICAgICAgICAgICAgICAgICBjb2xvciA9IHN1YnNldHNfbWl0b19wZXJjZW50KSkgKwogIGdlb21fcG9pbnQoYWxwaGEgPSAwLjUpICsKICBzY2FsZV9jb2xvcl92aXJpZGlzX2MoKSArCiAgbGFicyh4ID0gIlRvdGFsIENvdW50IiwKICAgICAgIHkgPSAiTnVtYmVyIG9mIEdlbmVzIEV4cHJlc3NlZCIsCiAgICAgICBjb2xvciA9ICJNaXRvY2hvbmRyaWFsXG5GcmFjdGlvbiIpICsKICB0aGVtZV9idygpCmBgYAoK
IyMjIEFwcGx5aW5nIGEgZmlsdGVyIHRvIGEgYFNpbmdsZUNlbGxFeHBlcmltZW50YAoKRmlsdGVyaW5nIHRoZSBgU2luZ2xlQ2VsbEV4cGVyaW1lbnRgIG9iamVjdCBpcyBkb25lIGFzIGlmIGl0IHdlcmUganVzdCB0aGUgY291bnRzIG1hdHJpeCwgd2l0aCBicmFja2V0cyBhbmQgaW5kZXhlcy4KV2hpbGUgdGhpcyB3aWxsIGxvb2sgbXVjaCBsaWtlIHdoYXQgd2UgZGlkIGJlZm9yZSwgaXQgaXMgYmV0dGVyLCBiZWNhdXNlIGl0IHdpbGwgYWxzbyBrZWVwIHRoZSBmaWx0ZXJlZCBRQyBzdGF0cyBhbG9uZ3NpZGUsIGluIGNhc2Ugd2Ugd2FudGVkIHRvIHJldmlzaXQgdGhlbSBsYXRlci4KT3RoZXJ3aXNlLCB3ZSB3b3VsZCBoYXZlIHRvIGZpbHRlciBvdXIgUUMgcmVzdWx0cyBzZXBhcmF0ZWx5LCB3aGljaCBpcyBhbiBlYXN5IHBsYWNlIGZvciBlcnJvcnMgdG8gY3JlZXAgaW4uCgpgYGB7ciBmaWx0ZXJfc2NlfQojIGNyZWF0ZSBhIGJvb2xlYW4gdmVjdG9yIG9mIFFDIGZpbHRlcnMKY2VsbHNfdG9fa2VlcCA8LSBibGFkZGVyX3NjZSRzdW0gPiAxMDAwICYKICBibGFkZGVyX3NjZSRkZXRlY3RlZCA+IDUwMCAmCiAgYmxhZGRlcl9zY2Ukc3Vic2V0c19taXRvX3BlcmNlbnQgPCAyMAoKIyBmaWx0ZXIgdGhlIHNjZSBvYmplY3QgKGNlbGxzIGFyZSBjb2x1bW5zKQpibGFkZGVyX3NjZV9maWx0ZXJlZCA8LSBibGFkZGVyX3NjZVssIGNlbGxzX3RvX2tlZXBdCmBgYAoKSnVzdCB0byBjaGVjaywgd2Ugc2hvdWxkIGhhdmUgdGhlIHNhbWUgbnVtYmVyIG9mIGNlbGxzIGluIGBibGFkZGVyX3NjZV9maWx0ZXJlZGAgYXMgb3VyIHByZXZpb3VzIGBzY19jb3VudHNfZmlsdGVyZWRgLgoKYGBge3IgY2hlY2tfY2VsbF9jb3VudCwgbGl2ZSA9IFRSVUV9Cm5jb2woc2NfY291bnRzX2ZpbHRlcmVkKSA9PSBuY29sKGJsYWRkZXJfc2NlX2ZpbHRlcmVkKQpgYGAKCiMjIE51bWJlciBvZiBjZWxscyB0aGF0IGV4cHJlc3MgYSBnZW5lIGFzIGEgcXVhbGl0eSBtZWFzdXJlCgpOb3cgd2UgaGF2ZSBhbiBpZGVhIG9mIHdoYXQgY2VsbHMgd2UgcHJvYmFibHkgd2FudCB0byBnZXQgcmlkIG9mLgpCdXQgd2hhdCBpZiBvdXIgZGF0YSBjb250YWlucyBnZW5lcyB0aGF0IHdlIGNhbid0IHJlbGlhYmx5IG1lYXN1cmUgaW4gdGhlc2UgY2VsbHM/CgpXZSBjb3VsZCB1c2Ugb3VyIGVhcmxpZXIgYGRldGVjdGlvbl9tYXRgIHRvIGFkZCB1cCBob3cgbWFueSBjZWxscyBleHByZXNzIGVhY2ggZ2VuZSwgYnV0IHdlIHdpbGwgc2tpcCBzdHJhaWdodCB0byB0aGUgYHNjYXRlcmAgZnVuY3Rpb24gdGhpcyB0aW1lLCB3aGljaCBpcyBjYWxsZWQgYGFkZFBlckZlYXR1cmVRQygpYC4KVGhpcyB3aWxsIGFkZCBRQyBzdGF0aXN0aWNzIHRvIHRoZSBgcm93RGF0YWAgZm9yIGVhY2ggZ2VuZSAoYWxvbmdzaWRlIHRoZSBhbm5vdGF0aW9uIGRhdGEgd2UgYWxyZWFkeSBoYWQgdGhlcmUpClRoZSBjb2x1bW5zIGl0IGFkZHMgYXJlIHRoZSBhdmVyYWdlIGV4cHJlc3Npb24gbGV2ZWwgb2YgZWFj
aCBnZW5lIChgbWVhbmApIGFuZCB0aGUgcGVyY2VudGFnZSBvZiBjZWxscyBpbiB3aGljaCBpdCB3YXMgZGV0ZWN0ZWQgKGBkZXRlY3RlZGApLgoKYGBge3Igc2FtcGxlX2V4cCwgbGl2ZSA9IFRSVUV9CmJsYWRkZXJfc2NlX2ZpbHRlcmVkIDwtIHNjYXRlcjo6YWRkUGVyRmVhdHVyZVFDKGJsYWRkZXJfc2NlX2ZpbHRlcmVkKQpgYGAKCkxldCdzIG1ha2UgYW5vdGhlciBkZW5zaXR5IHBsb3Qgd2l0aCB0aGUgcGVyY2VudGFnZSBvZiBzYW1wbGVzIHRoYXQgZXhwcmVzcyBlYWNoIGdlbmU6CgpgYGB7ciBzYW1wbGVfZXhwX3Bsb3R9CiMgZXh0cmFjdCB0aGUgZ2VuZSBpbmZvcm1hdGlvbiB3aXRoCmdlbmVfaW5mbyA8LSBkYXRhLmZyYW1lKHJvd0RhdGEoYmxhZGRlcl9zY2VfZmlsdGVyZWQpKQoKIyBQbG90IHRoZSBkZXRlY3RlZCBwZXJjZW50YWdlCmdncGxvdChnZW5lX2luZm8sIGFlcyh4ID0gZGV0ZWN0ZWQpICkrCiAgZ2VvbV9kZW5zaXR5KGZpbGwgPSAibGlnaHRibHVlIikgKwogIGxhYnMoeCA9ICJQZXJjZW50IG9mIENlbGxzIEV4cHJlc3NpbmcgRWFjaCBHZW5lIikgKwogIHRoZW1lX2NsYXNzaWMoKQpgYGAKCkhvdyBtYW55IGdlbmVzIHdpbGwgYmUgZXhjbHVkZWQgaWYgd2UgZHJhdyBvdXIgY3V0b2ZmIGF0IDUlIG9mIGNlbGxzPwoKYGBge3IgZmlsdGVyX2VmZmVjdCwgbGl2ZSA9IFRSVUV9CnN1bShnZW5lX2luZm8kZGV0ZWN0ZWQgPCA1KQpgYGAKClRoYXQncyBhIGxvdCEgSG93IGRvIHdlIGZlZWwgYWJvdXQgdGhhdD8KCmBgYHtyIGZpbHRlcl9nZW5lcywgbGl2ZSA9IFRSVUV9CmN1dG9mZiA8LSAyCiMgZmlsdGVyIGJsYWRkZXJfc2NlX2ZpbHRlcmVkIHRvIG9ubHkgZ2VuZXMgYWJvdmUgYSBjdXRvZmYgdmFsdWUKYmxhZGRlcl9zY2VfZmlsdGVyZWQgPC0gYmxhZGRlcl9zY2VfZmlsdGVyZWRbZ2VuZV9pbmZvJGRldGVjdGVkID49IGN1dG9mZiwgXQpgYGAKCkhvdyBiaWcgaXMgdGhlIGBTaW5nbGVDZWxsRXhwZXJpbWVudGAgb2JqZWN0IG5vdz8KCmBgYHtyIGZpbHRlcmVkX3NpemUsIGxpdmUgPSBUUlVFfQpkaW0oYmxhZGRlcl9zY2VfZmlsdGVyZWQpCmBgYAoKIyMgU2F2ZSB0aGUgZmlsdGVyZWQgZGF0YQoKV2Ugd2lsbCBzYXZlIHRoZSBmaWx0ZXJlZCBgU2luZ2xlQ2VsbEV4cGVyaW1lbnRgIG9iamVjdCBhcyBhIGAucmRzYCBmaWxlIGZvciBsYXRlciB1c2UuCgpgYGB7ciBzYXZlX3Jkc30KIyBTYXZlIG9iamVjdCB0byB0aGUgZmlsZSBmaWx0ZXJlZF9zY2VfZmlsZSwgd2hpY2gKIyB3ZSBkZWZpbmVkIGF0IHRoZSB0b3Agb2YgdGhpcyBub3RlYm9vawpyZWFkcjo6d3JpdGVfcmRzKGJsYWRkZXJfc2NlX2ZpbHRlcmVkLCBmaWxlID0gZmlsdGVyZWRfc2NlX2ZpbGUpCmBgYAoKCiMjIyBQcmludCBzZXNzaW9uIGluZm8KCmBgYHtyIHNlc3Npb25pbmZvfQpzZXNzaW9uSW5mbygpCmBgYAo=
+ + +
+
+ +
+ + + + + + + + + + + + + + + + + diff --git a/completed-notebooks/scRNA-seq/03-normalizing_scRNA.nb.html b/completed-notebooks/scRNA-seq/03-normalizing_scRNA.nb.html new file mode 100644 index 0000000..fecc288 --- /dev/null +++ b/completed-notebooks/scRNA-seq/03-normalizing_scRNA.nb.html @@ -0,0 +1,3548 @@ + + + + + + + + + + + + + + + +Normalizing scRNA-seq data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + +
+

Objectives

+

This notebook will demonstrate how to:

+
    +
  • Normalize expression counts to better compare expression among +cells
  • +
  • Explore the effects of normalization on variation among cells
  • +
+
+

In this notebook, we’ll continue processing the same dataset +that we have been working with, moving on to normalization of the scRNA-seq +count data that we have already put through quality control.

+

For this tutorial, we will be using a pair of R packages designed +specifically for single-cell analysis, scater and scran, to work +with our data. This tutorial is based in part on the scran +tutorial.

+
+Roadmap: QC and filtering +
+
+
+
+

Set Up

+

Load the libraries we will be using, and set the random number +generation seed value for reproducibility.

+ + + +
# Set seed for reproducibility
+set.seed(1234)
+
+# GGPlot2 for the plots
+library(ggplot2)
+
+# Packages for single cell processing
+library(scater)
+ + +
Loading required package: SingleCellExperiment
+ + +
Loading required package: SummarizedExperiment
+ + +
Loading required package: MatrixGenerics
+ + +
Loading required package: matrixStats
+ + +

+Attaching package: 'MatrixGenerics'
+ + +
The following objects are masked from 'package:matrixStats':
+
+    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
+    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
+    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
+    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
+    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
+    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
+    colWeightedMeans, colWeightedMedians, colWeightedSds,
+    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
+    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
+    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
+    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
+    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
+    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
+    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
+    rowWeightedSds, rowWeightedVars
+ + +
Loading required package: GenomicRanges
+ + +
Loading required package: stats4
+ + +
Loading required package: BiocGenerics
+ + +

+Attaching package: 'BiocGenerics'
+ + +
The following objects are masked from 'package:stats':
+
+    IQR, mad, sd, var, xtabs
+ + +
The following objects are masked from 'package:base':
+
+    anyDuplicated, aperm, append, as.data.frame, basename, cbind,
+    colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
+    get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
+    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
+    Position, rank, rbind, Reduce, rownames, sapply, setdiff, table,
+    tapply, union, unique, unsplit, which.max, which.min
+ + +
Loading required package: S4Vectors
+ + +

+Attaching package: 'S4Vectors'
+ + +
The following object is masked from 'package:utils':
+
+    findMatches
+ + +
The following objects are masked from 'package:base':
+
+    expand.grid, I, unname
+ + +
Loading required package: IRanges
+ + +
Loading required package: GenomeInfoDb
+ + +
Loading required package: Biobase
+ + +
Welcome to Bioconductor
+
+    Vignettes contain introductory material; view with
+    'browseVignettes()'. To cite Bioconductor, see
+    'citation("Biobase")', and for packages 'citation("pkgname")'.
+ + +

+Attaching package: 'Biobase'
+ + +
The following object is masked from 'package:MatrixGenerics':
+
+    rowMedians
+ + +
The following objects are masked from 'package:matrixStats':
+
+    anyMissing, rowMedians
+ + +
Warning: replacing previous import 'S4Arrays::makeNindexFromArrayViewport' by
+'DelayedArray::makeNindexFromArrayViewport' when loading 'SummarizedExperiment'
+ + +
Loading required package: scuttle
+ + +
library(scran)
+ + + +

Now let’s set up the files we will be using:

+ + + +
# main data directory
+data_dir <- file.path("data", "tabula-muris")
+
+# Filtered count matrix file from previous notebook
+filtered_sce_file <- file.path(data_dir, "filtered", "filtered_sce.rds")
+
+# Metadata file location
+metadata_file <- file.path(data_dir, "TM_droplet_metadata.csv")
+
+# Output directory for normalized data
+norm_dir <- file.path(data_dir, "normalized")
+fs::dir_create(norm_dir)
+ + + +
+
+

Read in the filtered count matrix and metadata

+ + + +
bladder_sce <- readr::read_rds(filtered_sce_file)
+sc_metadata <- readr::read_csv(metadata_file)
+ + +
Rows: 70118 Columns: 9
+── Column specification ────────────────────────────────────────────────────────
+Delimiter: ","
+chr (9): cell, channel, mouse.id, tissue, subtissue, mouse.sex, cell_ontolog...
+
+ℹ Use `spec()` to retrieve the full column specification for this data.
+ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
+ + + +
+

Adding more metadata to the SCE object

+

Because the Tabula Muris project is a well-studied data set, we +actually have some cell type information for this data set that we can +refer to.

+

Note that we would normally NOT have this +information until later in the analysis pipeline! Nonetheless, adding it +here will be useful for visualizing the results of our normalization +(and demonstrating how one might add metadata to the +SingleCellExperiment object).

+ + + +
# get the column (cell) metadata (this includes earlier QC stats!)
+# and convert to a data frame
+cell_info <- data.frame(colData(bladder_sce)) |>
+  # convert the row names of this data frame to a separate column
+  tibble::rownames_to_column("barcode")
+
+cell_metadata <- sc_metadata |>
+  # filter to just the sample we are working with
+  dplyr::filter(channel == "10X_P4_3") |>
+  # extract the 16 nt cell barcodes from the `cell` column
+  dplyr::mutate(barcode = stringr::str_sub(cell, start= -16)) |>
+  # choose only the columns we want to add
+  dplyr::select(barcode, cell_ontology_class, free_annotation)
+
+# Join the tables together, using `left_join()` to preserve all rows in cell_info
+cell_info <- cell_info |>
+  dplyr::left_join(cell_metadata)
+ + +
Joining with `by = join_by(barcode)`
+ + + +
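The barcode extraction and join in the chunk above can be sketched in base R. This is only an illustration under stated assumptions: the cell identifiers, barcodes, and cell type labels below are invented, and `substr()` with `match()` stands in for `stringr::str_sub()` and `dplyr::left_join()`. The point is that the last 16 characters of each identifier are the barcode, and that a left join keeps every row of the left-hand table, in its original order, filling in `NA` where no match exists.

```r
# Hypothetical cell identifiers in a "<channel>_<barcode>" style (made up)
cells <- c("10X_P4_3_AAACCTGAGGATGTAT", "10X_P4_3_TTTGTCATCTGCGTAA")

# Take the last 16 characters of each identifier as the barcode
barcodes <- substr(cells, nchar(cells) - 15, nchar(cells))

# Hypothetical metadata table keyed by barcode
toy_metadata <- data.frame(
  barcode = barcodes,
  cell_type = c("bladder cell", "endothelial cell"),
  stringsAsFactors = FALSE
)

# Hypothetical per-cell table, in a different order and with one unmatched cell
toy_cell_info <- data.frame(
  barcode = c("TTTGTCATCTGCGTAA", "AAACCTGAGGATGTAT", "GGGGGGGGGGGGGGGG"),
  stringsAsFactors = FALSE
)

# Left-join behaviour: look up each left-hand barcode in the metadata,
# keeping the left-hand row order; unmatched rows get NA
idx <- match(toy_cell_info$barcode, toy_metadata$barcode)
toy_cell_info$cell_type <- toy_metadata$cell_type[idx]
```

Because the row order of the left-hand table is preserved, the barcodes can afterwards be compared directly against the column names of the object.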

Check that the cell barcodes in our joined table still match the +column names of our data, in the same order.

+ + + +
all.equal(cell_info$barcode, colnames(bladder_sce))
+ + +
[1] TRUE
+ + + +

Now we can add that data back to the +SingleCellExperiment object. To keep with the format of +that object, we have to convert our table to a DataFrame +object in order for this to work. Just to keep things confusing, a +DataFrame is not the same as a data.frame that +we have been using throughout. We also need to be sure to include the +row.names argument to keep those properly attached.

+

Note that this will replace all of the previous column (cell) +metadata, which is part of the reason that we pulled out all previous +column data content first.

+ + + +
# add new metadata data back to `bladder_sce`
+colData(bladder_sce) <- DataFrame(cell_info, row.names = cell_info$barcode)
+ + + +
+
+
+

Normalization of count data

+

In whatever data we are working with, we are always looking to +maximize biological variance and minimize technical variance. A primary +source of technical variation we are concerned with is the variation in +library sizes among our samples. While different cells may have +different total transcript counts, it seems more likely that the primary +source of variation that we see is due to library construction, +amplification, and sequencing.

+

This is where normalization methods usually come into the workflow. +The distribution of the counts that we saw in the previous notebook, and +in particular the fact that the count data is noisy with many zero +counts, makes normalization particularly tricky. To handle this noise, +we normalize cells in groups with other cells like them, a method +introduced in Lun +et al. (2016).

+

Briefly, we first cluster the cells to find groups of similar cells, +then compute normalization factors based on the sums of expression in +those groups. The group normalization is then applied back to the +individual cells within the group to create a normalized count matrix. +In this case, we will also log-transform the normalized counts to get a +less skewed distribution of expression measures. Note that because of +the zero counts, the logNormCounts() function will add a +pseudocount of 1 to each value before computing the log.

+ + + +
# Step 1) Group cells with other like cells by clustering.
+qclust <- scran::quickCluster(bladder_sce)
+ + +
Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
+collapsing to unique 'x' values
+ + +
# Step 2) Compute sum factors for each cell cluster grouping.
+bladder_sce <- scran::computeSumFactors(bladder_sce, clusters = qclust)
+
+# Step 3) Normalize using these pooled sum factors and log transform.
+bladder_sce <- scater::logNormCounts(bladder_sce)
+ + + +
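To make the idea of a size factor concrete, here is a minimal base R sketch on a made-up count matrix. It uses plain library-size factors (each cell's total count divided by the mean total), which is a deliberate simplification and not the pooled method that `scran::computeSumFactors()` implements. All values and object names are invented for illustration. The final step adds a pseudocount of 1 before taking log2, as described in the text above.

```r
# Toy count matrix: rows are genes, columns are cells (values are made up).
# matrix() fills by column, so each line of numbers below is one cell.
toy_counts <- matrix(
  c(10, 0, 5,    # cell1
    20, 0, 10,   # cell2: same profile as cell1, sequenced twice as deeply
    40, 0, 20),  # cell3: same profile again, four times as deep
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("cell1", "cell2", "cell3"))
)

# Library size is the total count per cell
lib_sizes <- colSums(toy_counts)

# Simple size factors, scaled so that they average to 1
size_factors <- lib_sizes / mean(lib_sizes)

# Divide each column (cell) by its size factor
norm_counts <- sweep(toy_counts, 2, size_factors, "/")

# Log-transform with a pseudocount of 1 so that zero counts stay at zero
log_norm <- log2(norm_counts + 1)
```

In this toy case the three cells differ only in sequencing depth, so after dividing by the size factors their normalized profiles are identical.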
+

Compare normalized data to count data

+

One way to determine whether our normalization yields biologically +relevant results is to plot it and see if similarly labeled samples and +cells end up together. Because plotting expression for thousands of genes +together isn’t practical, we will reduce the dimensions of our data +using Principal Components Analysis (PCA).

+

We will also make the same plot with our unnormalized data, +to visualize the effect of normalization on our sample. We’ll do this +comparison twice:

+
    +
  • Once coloring the points by their total UMI count
  • +
  • Once coloring the points based on their cell labels
  • +
+

Before plotting the unnormalized data, we will log transform the raw +counts to make their scaling more comparable to the normalized data. To +do this we will use the log1p() function, which is +specifically designed for the case where we want to add 1 to all of our +values before taking the log, as we do here. (We could do something like +log(counts + 1), but this is both more efficient and more +accurate.)
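The accuracy point can be checked with a small base R sketch. The values here are made up: the first is chosen to be far smaller than the gap between 1 and the next representable number, which is where adding 1 before taking the log loses information.

```r
# A value far smaller than the spacing between 1 and the next double
x <- 1e-20

# 1 + x rounds to exactly 1, so the naive result is 0
naive <- log(1 + x)

# log1p() keeps the small value
stable <- log1p(x)

# For count-sized values the two approaches agree
toy_counts <- c(0, 1, 10, 1000)
same_for_counts <- isTRUE(all.equal(log(toy_counts + 1), log1p(toy_counts)))
```

Note that `log1p()` uses the natural logarithm, while `logNormCounts()` uses base 2 by default, so the two sets of values are on scales that differ by a constant factor.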

+ + + +
# Use PCA for dimension reduction of cells' scran normalized data
+norm_pca <- scater::calculatePCA(bladder_sce)
+
+# PCA on the raw counts, log transformed
+log_pca <- counts(bladder_sce) |> # get the raw counts
+  log1p() |> # log transform to make these more comparable to the normalized values
+  scater::calculatePCA() # calculate PCA scores
+ + + +

Note that we are using scater::calculatePCA() two +different ways here: once on the full bladder_sce object, +and once on just the counts matrix. When we use +calculatePCA() on the object, it automatically uses the log +normalized matrix from inside the object.
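For readers who want to see what the PCA step amounts to on a plain matrix, here is a rough base R sketch with `prcomp()` on random toy data. It is not what `calculatePCA()` does internally: among other things, that function restricts the calculation to a subset of the most variable genes by default, which this sketch omits. The matrix and object names are hypothetical.

```r
set.seed(2021)

# Toy log-expression matrix: 20 genes (rows) x 6 cells (columns), random values
expr <- matrix(
  rnorm(20 * 6),
  nrow = 20,
  dimnames = list(paste0("gene", 1:20), paste0("cell", 1:6))
)

# prcomp() expects observations in rows, so transpose to cells x genes
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# One row of scores per cell, one column per principal component
pca_scores <- pca$x
head(pca_scores[, c("PC1", "PC2")])
```

The result has one row per cell, which is the same orientation as the score matrices we use for plotting.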

+

Next we will arrange the PCA scores for plotting, adding a column for +each of the total UMI counts and the cell type labels so we can color +each point of the plot.

+ + + +
# Set up the PCA scores for plotting
+norm_pca_scores <- data.frame(norm_pca,
+                              geo_accession = rownames(norm_pca),
+                              total_umi = bladder_sce$sum,
+                              cell_type = bladder_sce$cell_ontology_class)
+log_pca_scores <- data.frame(log_pca,
+                             geo_accession = rownames(log_pca),
+                             total_umi = bladder_sce$sum,
+                             cell_type = bladder_sce$cell_ontology_class)
+ + + +

First, we will plot the unnormalized PCA scores with their total UMI +counts:

+ + + +
# Now plot counts pca
+ggplot(log_pca_scores, aes(x = PC1, y = PC2, color = total_umi)) +
+  geom_point() +
+  labs(title = "Log counts (unnormalized) PCA scores",
+       color = "Total UMI count")  +
+  scale_color_viridis_c() +
+  theme_bw()
+ + +

+ + + +

We’ve plotted the unnormalized data for you. Knowing that we want the +same graph, but different data, use the above template to plot the +normalized data. Feel free to customize the plot with a different theme +or color scheme!

+

Let’s plot the norm_pca_scores data:

+ + + +
ggplot(norm_pca_scores, aes(x = PC1, y = PC2, color = total_umi)) +
+  geom_point() +
+  labs(title = "Normalized log counts PCA scores",
+       color = "Total UMI count") +
+  scale_color_viridis_c() +
+  theme_bw()
+ + +

+ + + +

Do you see an effect from the normalization when comparing these +plots?

+

Now, let’s plot these two sets of PCA scores again, but colored by +cell type. Do you see an effect from the normalization when comparing +these plots?

+ + + +
# First, plot the normalized pca
+ggplot(norm_pca_scores, aes(x = PC1, y = PC2, color = cell_type)) +
+  geom_point() +
+  labs(title = "Normalized log counts PCA scores",
+       color = "Cell Type") +
+  scale_color_brewer(palette = "Dark2", na.value = "grey70") + # add a visually distinct color palette
+  theme_bw()
+ + +

+ + +
# Next, plot log count pca
+ggplot(log_pca_scores, aes(x = PC1, y = PC2, color = cell_type)) +
+  geom_point() +
+  labs(title = "Log counts (unnormalized) PCA scores",
+       color = "Cell Type") +
+  scale_color_brewer(palette = "Dark2", na.value = "grey70") + # add a visually distinct color palette
+  theme_bw()
+ + +

+ + + +
+
+
+

Save the normalized data to tsv file

+

In case we wanted to return to this data later, let’s save the +normalized data to a tsv file. In order to do this we need to extract +our normalized counts from bladder_sce. Refer back to the +SingleCellExperiment figure above to determine why we are +using this logcounts() function.

+

Recall that readr::write_tsv requires a data frame so we +need to convert the logcounts matrix to a data frame. We +will actually have to do this in two steps: first by converting the sparse +matrix to a standard R matrix, then converting that to a data frame.
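One detail to keep in mind is that `readr::write_tsv()` does not write row names, so gene identifiers stored only as row names will not appear in the saved file. If we wanted to keep them, one option is to move them into a regular column first (`tibble::rownames_to_column()` does this in the tidyverse). The base R sketch below shows the idea with a made-up matrix and a temporary file; the column name `gene` and all values are assumptions for illustration.

```r
# Toy matrix standing in for the logcounts matrix (values are made up)
mat <- matrix(
  c(0, 1.5, 2, 0, 3.2, 0),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)

# Move the row names into an explicit column before writing
df <- data.frame(gene = rownames(mat), mat, row.names = NULL)

# Write a tab-separated file without row names, then read it back
tsv_file <- tempfile(fileext = ".tsv")
write.table(df, tsv_file, sep = "\t", quote = FALSE, row.names = FALSE)
df_in <- read.delim(tsv_file)
```

Reading the file back gives a `gene` column alongside one column per cell, so the identifiers survive the round trip.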

+ + + +
# Save this gene matrix to a tsv file
+logcounts(bladder_sce) |>
+  as.matrix() |>
+  as.data.frame() |>
+  readr::write_tsv(file.path(norm_dir, "scran_norm_gene_matrix.tsv"))
+ + + +

We may want to return to our normalized bladder_sce +object in the future, so we will also save our data in an RDS file so +that we can re-load it into our R environment as a +SingleCellExperiment object.

+ + + +
# Save the data as an RDS
+readr::write_rds(bladder_sce, file.path(norm_dir, "normalized_bladder_sce.rds"))
+ + + + +
+ +
+ + +
+
+ +
+ + + + + + + + + + + + + + + + + diff --git a/completed-notebooks/scRNA-seq/04-dimension_reduction_scRNA.nb.html b/completed-notebooks/scRNA-seq/04-dimension_reduction_scRNA.nb.html new file mode 100644 index 0000000..195fcef --- /dev/null +++ b/completed-notebooks/scRNA-seq/04-dimension_reduction_scRNA.nb.html @@ -0,0 +1,4076 @@ + + + + + + + + + + + + + + + +Dimension Reduction with scRNA-seq data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + +
+

Objectives

+

This notebook will demonstrate how to:

+
    +
  • Read Cell Ranger data into R
  • +
  • Filter post-quantification cells using +emptyDropsCellRanger()
  • +
  • Apply dimensionality reduction methods to single cell data
  • +
  • Visualize samples in reduced dimensional space
  • +
+
+

In this notebook, we’ll try out some dimension reduction techniques +on single-cell RNA-seq data.

+

Visualizing highly dimensional data is a common challenge in +genomics, and especially with RNA-seq data. The expression of every gene +we look at is another dimension describing a sample. When we also have +hundreds or thousands of individual samples, as in the case of +single-cell analysis, figuring out how to clearly display all of the +data in a meaningful way is difficult.

+

A common practice is to use dimension reduction techniques +so all of the data is in a more manageable form for plotting, +clustering, and other downstream analyses.

+
+
+

Set Up

+ + + +
# Load libraries
+library(ggplot2)
+library(scater)
+ + +
Loading required package: SingleCellExperiment
+ + +
Loading required package: SummarizedExperiment
+ + +
Loading required package: MatrixGenerics
+ + +
Loading required package: matrixStats
+ + +

+Attaching package: 'MatrixGenerics'
+ + +
The following objects are masked from 'package:matrixStats':
+
+    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
+    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
+    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
+    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
+    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
+    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
+    colWeightedMeans, colWeightedMedians, colWeightedSds,
+    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
+    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
+    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
+    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
+    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
+    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
+    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
+    rowWeightedSds, rowWeightedVars
+ + +
Loading required package: GenomicRanges
+ + +
Loading required package: stats4
+ + +
Loading required package: BiocGenerics
+ + +

+Attaching package: 'BiocGenerics'
+ + +
The following objects are masked from 'package:stats':
+
+    IQR, mad, sd, var, xtabs
+ + +
The following objects are masked from 'package:base':
+
+    anyDuplicated, aperm, append, as.data.frame, basename, cbind,
+    colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
+    get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
+    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
+    Position, rank, rbind, Reduce, rownames, sapply, setdiff, table,
+    tapply, union, unique, unsplit, which.max, which.min
+ + +
Loading required package: S4Vectors
+ + +

+Attaching package: 'S4Vectors'
+ + +
The following object is masked from 'package:utils':
+
+    findMatches
+ + +
The following objects are masked from 'package:base':
+
+    expand.grid, I, unname
+ + +
Loading required package: IRanges
+ + +
Loading required package: GenomeInfoDb
+ + +
Loading required package: Biobase
+ + +
Welcome to Bioconductor
+
+    Vignettes contain introductory material; view with
+    'browseVignettes()'. To cite Bioconductor, see
+    'citation("Biobase")', and for packages 'citation("pkgname")'.
+ + +

+Attaching package: 'Biobase'
+ + +
The following object is masked from 'package:MatrixGenerics':
+
+    rowMedians
+ + +
The following objects are masked from 'package:matrixStats':
+
+    anyMissing, rowMedians
+ + +
Warning: replacing previous import 'S4Arrays::makeNindexFromArrayViewport' by
+'DelayedArray::makeNindexFromArrayViewport' when loading 'SummarizedExperiment'
+ + +
Loading required package: scuttle
+ + +
library(scran)
+
+# Setting the seed for reproducibility
+set.seed(12345)
+ + + +
+

Directories and files

+

The data we will be using for this module comes from a 10x Genomics +data set of expression +data from a Hodgkin’s Lymphoma tumor. The data was generated with +the 10Xv3.1 chemistry, and processed with Cell Ranger and 10x Genomics +standard pipeline.

+

There are a variety of files that you will often see as part of the +standard output from Cell Ranger, which are described in detail in 10x +Genomics documentation. We have included some of these in the +data/hodgkins/cellranger directory, including the +web_summary.html file that includes some similar QC +statistics to those we generated with alevinQC. The main +files we will be working with are the feature-by-barcode matrices. Cell +Ranger does some filtering on its own, but we will start with the raw +data.

+ + + +
# main data directory
+data_dir <- file.path("data", "hodgkins")
+# reference files
+ref_dir <- file.path("data", "reference")
+
+# Path to the Cell Ranger matrix
+raw_matrix_dir <- file.path(data_dir, "cellranger",
+                            "raw_feature_bc_matrix")
+
+# Path to mitochondrial genes table
+mito_file <- file.path(ref_dir, "hs_mitochondrial_genes.tsv")
+
+# Directory and file to save output
+normalized_dir <- file.path(data_dir, "normalized")
+fs::dir_create(normalized_dir)
+
+output_sce_file <- file.path(normalized_dir, "normalized_hodgkins_sce.rds")
+ + + +
+
+
+

Reading Cell Ranger data

+

Cell Ranger output includes count data in two main formats. The first +is a folder with a feature list, a barcode list, and a sparse matrix in +“Matrix +Exchange” format. The DropletUtils::read10xCounts() +function takes this directory and reads in the data from these three +files, assembling the SingleCellExperiment object we have +worked with before.

+

Alternatively, we could use the HDF5 format file that +Cell Ranger outputs as a file with the .h5 extension, which +contains the same data. For whatever reason, the way you read the data +affects how it is stored in R. Reading from the directory results in +smaller objects in R, so that is what we will do here.

+

Cell Ranger also outputs both filtered and raw matrices; today we +will start with the raw matrix and perform our own filtering.

+
+Roadmap: Preprocessing and Import +
Roadmap: Preprocessing and Import
+
+ + + +
hodgkins_sce <- DropletUtils::read10xCounts(raw_matrix_dir)
+ + +
Warning: replacing previous import 'S4Arrays::makeNindexFromArrayViewport' by
+'DelayedArray::makeNindexFromArrayViewport' when loading 'HDF5Array'
+ + + +

How many potential cells are there here?

+ + + +
dim(hodgkins_sce)
+ + +
[1]   36601 6794880
+ + + +

That is a lot of cells! In fact, it is really every possible barcode, +whether there were reads associated with it or not. We should probably +do something about that.

+
+
+

QC and normalization

+
+Roadmap: QC and filtering +
Roadmap: QC and filtering
+
+
+

Basic QC stats

+

We will start by calculating the basic QC stats as we have done +previously, adding those to our SingleCellExperiment +object.

+

The first step again is reading in our table of mitochondrial genes +and finding the ones that were quantified in our data set.

+ + + +
mito_genes <- readr::read_tsv(mito_file) |>
+  dplyr::filter(gene_id %in% rownames(hodgkins_sce)) |>
+  dplyr::pull(gene_id)
+ + +
Rows: 37 Columns: 13
+── Column specification ────────────────────────────────────────────────────────
+Delimiter: "\t"
+chr (9): gene_id, gene_name, seqnames, strand, gene_biotype, seq_coord_syste...
+dbl (4): start, end, width, entrezid
+
+ℹ Use `spec()` to retrieve the full column specification for this data.
+ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
+ + + +

Next we will calculate the QC stats that we used before. Note that +this is much slower than before, as we have many more barcodes (potential cells) in the +unfiltered set!

+ + + +
hodgkins_sce <- scater::addPerCellQC(
+  hodgkins_sce,
+  subsets = list(mito = mito_genes))
+ + + +

We can now do the most basic level of filtering: getting rid of +“cells” with no reads.

+ + + +
hodgkins_sce <- hodgkins_sce[, hodgkins_sce$total > 0]
+dim(hodgkins_sce)
+ + +
[1]  36601 549931
+ + + +
+
+

Filtering with emptyDropsCellRanger()

+

The DropletUtils package that we used to read in the 10x +data has a number of other useful features. One is the +emptyDropsCellRanger() function, which uses the overall +gene expression patterns in the sample to identify droplets that are +likely to not contain an intact cell, but may simply have contained +loose ambient RNA released during cell separation. This method was +originally developed by Lun et al. +(2019) and implemented as the function emptyDrops(), +but has since been adapted as the main filtering method used by Cell +Ranger. The emptyDropsCellRanger() function emulates the +variant of this function that is used by Cell Ranger, making the results +more comparable between the two methods.

+

The Empty Drops method uses the droplets with very low UMI counts to +estimate the “ambient” expression pattern of RNA content outside of +cells. It then scores the remaining cells based on how much they deviate +from that pattern, assigning a small P value when the droplet’s +expression deviates from the ambient expression pattern. Because it uses +the low UMI count droplets, this method should not be used when other +filtering has already been performed (which is unfortunately the case +with the version of salmon alevin we used).

+

This method seems to perform well to exclude false “cells” while +retaining cells with distinct expression profiles but low counts that +might have failed a simple cutoff. Note that this method also requires +that the data has already been quantified, with reads assigned to genes, +as compared to a simple total UMI count filter which can be performed +much earlier in the pipeline.

+

The emptyDropsCellRanger() function takes the +counts matrix from our SingleCellExperiment, and returns a +data frame with the statistics it calculates. This will take a few +minutes to run, but we can speed it up by allowing parallel +processing.

+ + + +
droplet_stats <- DropletUtils::emptyDropsCellRanger(
+  counts(hodgkins_sce),
+  BPPARAM = BiocParallel::MulticoreParam(4)) # use multiprocessing
+ + + +

We will use a false discovery rate (FDR) of 0.01 as our cutoff for +“real” cells. Since emptyDropsCellRanger() uses low count +cells to estimate the “ambient” expression pattern, those cells are not +assigned an FDR value, and have a value of NA. These NAs can be a +problem for filtering with a Boolean vector, as we did above, so instead +we will use the which() function to get the +positions of the cells that pass our filter and select the +columns we want using that.
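The difference between the two indexing approaches is easy to see on a plain vector. The FDR values and droplet names below are made up for illustration.

```r
# Toy FDR values; NA marks droplets that were not assigned an FDR
fdr <- c(0.001, NA, 0.5, 0.005, NA)
droplets <- c("d1", "d2", "d3", "d4", "d5")

# A logical index carries the NAs along, producing NA entries in the result
with_logical <- droplets[fdr <= 0.01]

# which() returns only the positions where the test is TRUE
keep <- which(fdr <= 0.01)
with_which <- droplets[keep]
```

Here `with_logical` contains `NA` entries for the droplets without an FDR, while `with_which` contains only the droplets that pass the cutoff.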

+ + + +
cells_to_retain <- which(droplet_stats$FDR <= 0.01)
+
+filtered_sce <- hodgkins_sce[, cells_to_retain]
+dim(filtered_sce)
+ + +
[1] 36601  3401
+ + + +

How does this compare to the number of cells in the Cell Ranger +filtered data? Looking at the web_summary.html report from +Cell Ranger, it seems that it would have kept 3,394 cells, so we seem to +be getting broadly similar results.

+
+
+

Checking mitochondrial content

+

While emptyDropsCellRanger() should have filtered out +droplets containing no cells, it will not necessarily filter out damaged +cells. For that we will still want to look at mitochondrial content, as +we did previously. The statistics we calculated earlier with +addPerCellQC() are retained in our new object, so we can +plot those directly.

+ + + +
# Plot the mitochondrial percents stored in `filtered_sce`
+ggplot(mapping = aes(x = filtered_sce$subsets_mito_percent)) +
+  geom_histogram(bins = 100)
+ + +

+ + + +

There are certainly some cells with high mitochondrial percentages! +For now, we will use a cutoff of 20% to filter out the worst of the +cells.

+ + + +
filtered_sce <- filtered_sce[, filtered_sce$subsets_mito_percent < 20]
+ + + +
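As a reminder of what this statistic measures, the sketch below recomputes a mitochondrial percentage by hand for a tiny invented count matrix and applies the same 20% cutoff. The gene and cell names are placeholders; in the notebook, `addPerCellQC()` has already done this calculation for us.

```r
# Invented counts: rows are genes, columns are cells
toy_counts <- rbind(
  MT_gene_1 = c(5, 40, 1),
  MT_gene_2 = c(5, 20, 1),
  nuclear_1 = c(50, 20, 60),
  nuclear_2 = c(40, 20, 38)
)
colnames(toy_counts) <- c("cell_a", "cell_b", "cell_c")

# Hypothetical list of mitochondrial genes
mito_rows <- c("MT_gene_1", "MT_gene_2")

# Percent of each cell's counts that come from mitochondrial genes
mito_percent <- 100 * colSums(toy_counts[mito_rows, ]) / colSums(toy_counts)
print(mito_percent)

# Keep only the cells (columns) below the 20% cutoff
kept_counts <- toy_counts[, mito_percent < 20]
print(colnames(kept_counts))
```

Here `cell_b` has 60% mitochondrial counts and is the only cell removed.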

We can also filter by features (genes in our case) using +scater::addPerFeatureQC(), which will compute the percentage of +samples where each gene is detected and the mean count of each gene across all samples. +We can then use those data (stored in rowData) to filter by +row to only the genes that are detected in more than 5% of cells, and +with a mean count > 0.1.

+ + + +
filtered_sce <- scater::addPerFeatureQC(filtered_sce)
+detected <- rowData(filtered_sce)$detected > 5
+expressed <- rowData(filtered_sce)$mean > 0.1
+
+# filter the genes (rows) this time
+filtered_sce <- filtered_sce[detected & expressed, ]
+ + + +
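The two per-gene statistics can be sketched in base R, assuming (as the thresholds above imply) that `detected` is the percentage of cells with a non-zero count and `mean` is the average count across cells. The matrix and gene names below are invented for illustration.

```r
# Invented counts for three genes across 25 cells
toy_counts <- rbind(
  rare_but_high = c(8, rep(0, 24)),          # detected in 1 of 25 cells
  low_everywhere = c(1, 1, rep(0, 23)),      # detected in 2 cells, tiny counts
  broadly_expressed = c(rep(2, 10), rep(0, 15))
)

# Percent of cells where each gene has a non-zero count
pct_detected <- 100 * rowMeans(toy_counts > 0)

# Mean count of each gene across all cells
mean_count <- rowMeans(toy_counts)

# Same thresholds as the chunk above
keep_gene <- pct_detected > 5 & mean_count > 0.1
print(data.frame(pct_detected, mean_count, keep_gene))

# Filter the rows (genes)
filtered_counts <- toy_counts[keep_gene, , drop = FALSE]
```

Only `broadly_expressed` passes both tests: the first gene fails the detection threshold and the second fails the mean count threshold.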

How many genes and cells do we have now?

+ + + +
dim(filtered_sce)
+ + +
[1] 7445 2747
+ + + +
+
+

Normalize

+

Now we will perform the same normalization steps we did in a previous +dataset, using scran::computeSumFactors() and +scater::logNormCounts(). You might recall that there is a +bit of randomness in some of these calculations, so we should be sure to +have used set.seed() earlier in the notebook for +reproducibility.

+ + + +
# Cluster similar cells
+qclust <- scran::quickCluster(filtered_sce)
+
+# Compute sum factors for each cell cluster grouping.
+filtered_sce <- scran::computeSumFactors(filtered_sce, clusters = qclust, positive = FALSE)
+ + +
Warning in (function (x, sizes, min.mean = NULL, positive = FALSE, scaling =
+NULL) : encountered non-positive size factor estimates
+ + + +

It turns out in this case we end up with some negative size factors. +This is usually an indication that our filtering was not stringent +enough, and there remain a number of cells or genes with nearly zero +counts. This probably happened when we removed the +infrequently-expressed genes; cells which had high counts from those +particular genes (and few others) could have had their total counts +dramatically reduced.

+

To account for this, we will recalculate the per-cell stats and +filter out low counts. Unfortunately, to do this, we need to first +remove the previously calculated statistics, which we will do by setting +them to NULL.

+ + + +
# remove previous calculations
+filtered_sce$sum <- NULL
+filtered_sce$detected <- NULL
+filtered_sce$total <- NULL
+filtered_sce$subsets_mito_sum <- NULL
+filtered_sce$subsets_mito_detected <- NULL
+filtered_sce$subsets_mito_percent <- NULL
+
+# recalculate cell stats
+filtered_sce <- scater::addPerCellQC(filtered_sce, subsets = list(mito = mito_genes))
+
+# print the number of cells with fewer than 500 UMIs
+sum(filtered_sce$sum < 500)
+ + +
[1] 13
+ + + +
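The explanation for the negative size factors can be illustrated with a toy matrix. This is only a sketch with invented numbers: one cell gets nearly all of its counts from a gene that is then filtered out, so its total drops below the 500 UMI threshold.

```r
# Invented counts: cell_c is dominated by a single rarely expressed gene
toy_counts <- rbind(
  common_gene_1 = c(300, 400, 2),
  common_gene_2 = c(350, 250, 1),
  rare_gene     = c(0, 0, 900)
)
colnames(toy_counts) <- c("cell_a", "cell_b", "cell_c")

# Totals before gene filtering
totals_before <- colSums(toy_counts)
print(totals_before)

# Remove the rarely expressed gene, then recalculate the totals
filtered_counts <- toy_counts[c("common_gene_1", "common_gene_2"), ]
totals_after <- colSums(filtered_counts)
print(totals_after)

# Cells that now fall below 500 UMIs
low_cells <- names(totals_after)[totals_after < 500]
print(low_cells)
```

All three cells pass a 500 UMI cutoff before the gene filter, but `cell_c` is left with only 3 counts afterward.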

Now we can filter again. In this case, we will keep cells with at +least 500 UMIs after removing the lowly expressed genes. Then we will +redo the size factor calculation, hopefully with no more warnings.

+ + + +
filtered_sce <- filtered_sce[, filtered_sce$sum >= 500]
+
+qclust <- scran::quickCluster(filtered_sce)
+
+filtered_sce <- scran::computeSumFactors(filtered_sce, clusters = qclust)
+ + + +

Looks good! Now we’ll do the normalization.

+ + + +
# Normalize and log transform.
+normalized_sce <- scater::logNormCounts(filtered_sce)
+ + + +
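For intuition about what the normalized values represent, here is a simplified base R sketch. It assumes plain library-size factors rather than the deconvolution factors from `computeSumFactors()`, and it uses a log2 transform with a pseudocount of 1, which is the `logNormCounts()` default. The counts are invented.

```r
# Invented counts: cell_b has the same profile as cell_a at twice the depth
toy_counts <- matrix(
  c(10, 0, 5,    # cell_a
    20, 0, 10,   # cell_b
    5, 5, 5),    # cell_c
  nrow = 3,
  dimnames = list(
    c("gene_1", "gene_2", "gene_3"),
    c("cell_a", "cell_b", "cell_c")
  )
)

# Simple size factors: library size scaled to have a mean of 1
library_sizes <- colSums(toy_counts)
size_factors <- library_sizes / mean(library_sizes)

# Divide each cell (column) by its size factor, then log transform
normalized <- sweep(toy_counts, MARGIN = 2, STATS = size_factors, FUN = "/")
toy_logcounts <- log2(normalized + 1)
print(round(toy_logcounts, 2))
```

After normalization, `cell_a` and `cell_b` have identical values, because the difference in sequencing depth has been scaled away.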

At this point, we have a few different versions of our +SingleCellExperiment object. The original (mostly) +unfiltered version is in hodgkins_sce, the filtered version +in filtered_sce, and the normalized version in +normalized_sce. We can clean those up a bit to save memory, +keeping only the latest normalized_sce version, which now +has two assays: counts with the raw data and +logcounts with the normalized and transformed data.

+ + + +
assayNames(normalized_sce)
+ + +
[1] "counts"    "logcounts"
+ + +
+ + +
rm(hodgkins_sce, filtered_sce)
+ + + +
+
+
+

Dimensionality reduction and display

+
+Roadmap: Dimensionality reduction +
Roadmap: Dimensionality reduction
+
+
+

Principal Components Analysis

+

Principal component analysis (PCA) is a dimensionality reduction +technique that allows us to identify the largest components of variation +in a complex dataset. Our expression data can be thought of as mapping +each sample in a multidimensional space defined by the expression level +of each gene. The expression levels of many of those genes are correlated, so +we can often get a better, simpler picture of the data by combining the +information from those correlated genes.

+

PCA rotates and transforms this space so that each axis is now a +combination of multiple correlated genes, ordered so the first axes +capture the most variation from the data. These new axes are the +“principal components.” If we look at the first few components, we can +often get a nice overview of relationships among the samples in the +data.
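A small base R sketch of this idea, using `stats::prcomp()` on simulated data rather than the notebook's `SingleCellExperiment`: two invented "genes" share a common signal and a third is independent noise, so most of the variation should land on the first principal component. All names and values here are made up for illustration.

```r
set.seed(2024)
n_cells <- 200

# A shared signal drives two correlated genes
shared_activity <- rnorm(n_cells)
toy_expression <- cbind(
  gene_1 = shared_activity + rnorm(n_cells, sd = 0.1),
  gene_2 = shared_activity + rnorm(n_cells, sd = 0.1),
  gene_3 = rnorm(n_cells, sd = 0.3)   # unrelated to the other two
)

# prcomp() expects samples (cells) in rows and features (genes) in columns
toy_pca <- prcomp(toy_expression, center = TRUE, scale. = FALSE)

# Proportion of the total variance captured by each component
var_explained <- toy_pca$sdev^2 / sum(toy_pca$sdev^2)
print(round(var_explained, 3))

# The new coordinates: one row per cell, one column per component
print(head(toy_pca$x, 3))
```

The components are ordered by the variance they capture, so looking at only the first one or two retains most of the information from the correlated genes.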

+
+

Storing PCA results with the raw data

+

We will store the PCA results in our +SingleCellExperiment object, as we will want to use them +later. To do this, we will use the runPCA() function from +scater, which performs the PCA calculations and returns a +new object with the results stored in the reducedDim slot. +If we wanted to, we could get the raw results as a matrix instead with the +calculatePCA() function, as we did in a previous +notebook.

+

We will also use the ntop argument to calculate the PCA +using the 2000 genes with the highest variance. The default is +ntop = 500.

+ + + +
# calculate PCA using the top 2000 genes
+normalized_sce <- runPCA(normalized_sce, ntop = 2000)
+ + + +

We can see what reduced dimensionality matrices are stored in the +object with the reducedDimNames() function.

+ + + +
# print the reduced dimensionality tables available
+reducedDimNames(normalized_sce)
+ + +
[1] "PCA"
+ + +
+ + + +

To extract them by name, we use the reducedDim() +function, much like the assay() function to extract +original data.

+ + + +
# print the top corner of the PCA matrix
+reducedDim(normalized_sce, "PCA")[1:10, 1:5]
+ + +
            PC1        PC2        PC3         PC4        PC5
+ [1,] 15.379889   7.929537  11.790972   3.9845847   8.232603
+ [2,] -1.220424   6.053443   2.100161 -10.7532071  -6.769888
+ [3,]  0.647626  -7.566098 -12.004547  -0.9862535   8.936328
+ [4,]  7.922958 -21.975167   4.156634 -10.0798546   5.383158
+ [5,]  1.577685 -26.510992  11.089218  10.5259441 -11.660812
+ [6,]  5.676435 -27.571708  13.655945  -9.3006045   2.594320
+ [7,] -3.623704  -2.052362  -9.061048  -5.7941810   9.112434
+ [8,]  9.577588  12.977445  -4.125885  -0.1949252 -10.757970
+ [9,]  9.393483  -7.288896 -12.661552   6.4502827  -2.371315
+[10,] -7.917621  -8.052548  -9.061915   0.8675157   9.034390
+ + + +
+
+

Plotting PCA results

+

If we have the PCA results stored in the +SingleCellExperiment object, we can use the +scater::plotReducedDim() function to plot it with some nice +defaults easily. One nice thing about this function is that it uses +ggplot2 under the hood, so if we wanted to customize it +later, we could.

+ + + +
# plot PCA results
+plotReducedDim(normalized_sce, "PCA")
+ + +

+ + + +

PCA gives us a matrix with more than just two dimensions, and we +might want to look at some higher dimensions too. We can do that with +the ncomponents argument.

+ + + +
# plot PC3 and PC4
+plotReducedDim(normalized_sce, "PCA", ncomponents = c(3,4))
+ + +

+ + + +
+
+
+

Modeling variance

+

The variation in gene expression we see among cells comes from a +combination of variation due to technical effects and the biology we +really care about. In order to roughly account for this we could just +take the largest variance genes, on the assumption that low variance +genes are mostly just noise. This is the default approach that +runPCA() and calculatePCA() take, using the +genes with the greatest variance across cells to calculate the PCA +matrix.

+

If we want to be a bit more careful about it, we can model the +variance in expression of each gene as a function of the mean expression +for that gene. This is useful because we generally expect the variance +to increase as mean expression increases, even if there is no biological +signal in the expression variation.

+

We will do this modeling of variance by expression with the +scran::modelGeneVar() function, saving the results to a new +variable.

+ + + +
gene_variance <- scran::modelGeneVar(normalized_sce)
+ + + +

Now let’s plot the relationship between gene expression and variance +we were discussing. Here we will also add the fitted curve that +scran::modelGeneVar() created, which is stored as a function +named trend in the metadata of the gene_variance +object. We can add a function like that curve to a ggplot +with a stat_function layer.

+ + + +
ggplot(as.data.frame(gene_variance), aes(x = mean, y = total)) +
+  geom_point(alpha = 0.1) +
+  stat_function(fun = metadata(gene_variance)$trend, color = "blue") +
+  labs(
+    x = "Mean log-expression",
+    y = "Variance") +
+  theme_bw()
+ + +

+ + + +
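To make the idea of a mean/variance trend more concrete, here is a rough base R sketch on simulated counts. It is not what `scran::modelGeneVar()` does internally: a `loess()` curve stands in for the fitted trend, and the column names (`mean`, `total`, `tech`, `bio`) are chosen only to mirror the kind of table that function returns. The simulated genes and the 5-fold "biological" shift are assumptions made for illustration.

```r
set.seed(12345)
n_genes <- 500
n_cells <- 100

# Simulated Poisson counts: for most genes the only variation is sampling noise
gene_means <- runif(n_genes, min = 2, max = 20)
toy_counts <- matrix(
  rpois(n_genes * n_cells, lambda = rep(gene_means, times = n_cells)),
  nrow = n_genes
)

# Give the first 20 genes a "biological" signal: 5x higher in half of the cells
bio_genes <- 1:20
group_2 <- 51:100
toy_counts[bio_genes, group_2] <- matrix(
  rpois(
    length(bio_genes) * length(group_2),
    lambda = rep(5 * gene_means[bio_genes], times = length(group_2))
  ),
  nrow = length(bio_genes)
)

toy_logcounts <- log2(toy_counts + 1)

# Per-gene mean and total variance of the log-expression values
gene_stats <- data.frame(
  mean = rowMeans(toy_logcounts),
  total = apply(toy_logcounts, 1, var)
)

# Fit a smooth trend of variance against mean expression
trend_fit <- loess(total ~ mean, data = gene_stats)
gene_stats$tech <- predict(trend_fit)                 # variance expected from the trend
gene_stats$bio <- gene_stats$total - gene_stats$tech  # residual variance

# Genes with the largest residual variance
top_genes <- order(gene_stats$bio, decreasing = TRUE)[1:20]
print(sum(top_genes %in% bio_genes))
```

Genes that sit well above the curve have more variance than their expression level alone would predict, which is the property used to pick highly variable genes.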

Now we can use scran::getTopHVGs() to select the genes +that have the most biological variation (according to the model) and +recalculate PCA scores using only those genes. (In practice, we are +selecting the genes with the largest residual variation after removing +technical variation modeled by the mean/variance relationship.)

+

Here we are picking the 2000 top genes to match the number of genes +from our earlier calculations.

+ + + +
# select the most variable genes
+highvar_genes <- scran::getTopHVGs(gene_variance, n = 2000)
+# calculate a PCA matrix using those genes
+normalized_sce <- runPCA(normalized_sce, subset_row = highvar_genes)
+ + + +

Now we can plot our new PCA values for comparison. You might realize +that our old PCA values were replaced when we ran runPCA() +again, so we can’t recreate the earlier plots at this stage of the +notebook. You will have to scroll up to your earlier plots to +compare.

+ + + +
# plot the new PCA results
+plotReducedDim(normalized_sce, "PCA")
+ + +

+ + +
plotReducedDim(normalized_sce, "PCA", ncomponents = c(3,4))
+ + +

+ + + +
+
+

UMAP

+

UMAP (Uniform Manifold Approximation and Projection) +is a machine learning technique designed to provide more detail in +highly dimensional data than a typical principal components analysis. +While PCA assumes that the variation we care about has a particular +distribution (normal, broadly speaking), UMAP allows more complicated +distributions that it learns from the data. The underlying mathematics +are beyond me, but if you are more ambitious than I, you can look at the +paper by McInnes, Healy, +& Melville (2018). The main advantage of this change in +underlying assumptions is that UMAP can do a better job separating +clusters, especially when some of those clusters may be more similar to +each other than others.

+

Another dimensionality reduction technique that you may have heard of +is t-SNE (t-distributed Stochastic Neighbor Embedding), +which has similar properties to UMAP, and often produces similar +results. There is some ongoing debate about which of these two +techniques is superior, and whether the differences are due to the +underlying algorithm or to implementation and parameter initialization +defaults. Regardless of why, in our experience, UMAP seems to produce +slightly better results and run a bit faster, but the differences can be +subtle.

+
+

Default parameters

+

For ease of use with this data, we will be using the +scater::calculateUMAP() and scater::runUMAP() +functions to apply UMAP to our single cell data, but similar functions +from the uwot package (notably uwot::umap()) can be +used to apply UMAP to any numerical matrix.
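As a minimal sketch of the more general approach, `uwot::umap()` can be applied directly to a plain numeric matrix. The matrix below is random noise invented for illustration, and the call is guarded in case `uwot` is not installed. Note that `uwot` expects observations in rows, which is the transpose of how a `SingleCellExperiment` stores cells.

```r
set.seed(12345)

# Invented numeric matrix: 100 observations (rows) by 10 features (columns)
toy_matrix <- matrix(rnorm(100 * 10), nrow = 100)

if (requireNamespace("uwot", quietly = TRUE)) {
  # Returns a matrix with one row per observation and one column per UMAP dimension
  toy_umap <- uwot::umap(toy_matrix, n_neighbors = 15, n_components = 2)
  print(dim(toy_umap))
} else {
  message("The uwot package is not installed; skipping the UMAP example.")
}
```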

+

UMAP can be slow for a large data set with many dimensions (genes, in our case). It is +worth noting that the scater::calculateUMAP() +implementation actually does PCA first, and then runs UMAP on the top 50 +PCs. If we have already calculated PCA (as we have) we can tell it to +use those results with the dimred argument.

+

As with PCA, there are two functions we could use: +scater::calculateUMAP() will return a matrix of results, +with one row for each sample, and a column for each of the UMAP +dimensions returned. scater::runUMAP() performs the same +function, but returns the results in a SingleCellExperiment object.

+

Let’s see how it looks with the (mostly) default parameters:

+ + + +
# Run UMAP
+normalized_sce <- runUMAP(normalized_sce,
+                          dimred = "PCA") # use already stored PCA results
+ + + +

Now we can plot with the same plotReducedDim() function, +specifying we want to plot the UMAP results this time. We will also add +some color this time with the color_by argument, using the +number of genes detected in each cell to assign a hue.

+ + + +
# make a UMAP plot with `plotReducedDim()`
+plotReducedDim(normalized_sce, "UMAP", color_by = "detected")
+ + +

+ + + +

There is clearly a lot of structure in there, but is it meaningful? +Do the clusters we see differentiate cell types? How should we divide +them up?

+

We will come back to this question later!

+
+
+
+

UMAP experiments

+

Now that we have an idea of what a UMAP plot with the default +parameters looks like, let’s try experimenting with the +n_neighbors parameter. First, we should see what this +parameter is, and what the default value is. In the console, run +?scater::calculateUMAP to see what this (and other +parameters) are. For even more parameters, you can look at the +underlying implementation code that calculateUMAP() uses, +which is the function uwot::umap().

+

In order to make our experimentation easier, we will create a +function that allows us to rerun the same code easily, but +create an argument that allows us to change one variable: the +n_neighbors variable. Here we are saving only a line of +code, but we could apply this to a much more complex series of +operations if we wanted to!

+ + + +
UMAP_plot_wrapper <- function(sce = normalized_sce, nn_param = 15) {
+  # Purpose: Run UMAP and plot the output
+  # Args: sce: a SingleCellExperiment with PCA results stored in reducedDim
+  #       nn_param: a single numeric argument that sets the
+  #                 n_neighbors argument of the runUMAP() function.
+  # Output: a scatterplot with the two UMAP coordinates plotted and
+  #         points colored by the number of genes detected in each cell.
+
+  # Run UMAP with a specified n_neighbors parameter
+  sce_umap <- scater::runUMAP(sce, dimred = "PCA", n_neighbors = nn_param)
+  scater::plotReducedDim(sce_umap, "UMAP", color_by = "detected") +
+    # make the legend label more informative (this is ggplot2 code!)
+    guides(color = guide_colorbar(title="genes\nexpressed"))
+}
+ + + +

Let’s make sure that works and gives the same result as before when +we use the default parameters.

+ + + +
UMAP_plot_wrapper(nn_param = 15)
+ + +

+ + + +

Kind of?

+

This isn’t your fault! UMAP is a non-deterministic function, which +means that there is a random component to the results. We can use +set.seed() to be sure that an individual run (or set of +runs) is the same every time you run your analysis, but it is important +to check your results a few times with different random starting points +to be sure that the random component is not giving you anomalous +results. Setting a different random number seed with +set.seed() is one way to do this, or you can run the +analysis multiple times in the same session, as we have done here.
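The effect of the random seed can be shown with a stand-in for any stochastic calculation. In this sketch, `noisy_embedding()` is a hypothetical function that just returns random coordinates; it is not UMAP, but it responds to `set.seed()` in the same way.

```r
# Hypothetical stand-in for a stochastic method: random 2D coordinates
noisy_embedding <- function(n_points = 5) {
  matrix(rnorm(n_points * 2), ncol = 2)
}

# Resetting the seed before each run gives identical results
set.seed(12345)
run_1 <- noisy_embedding()
set.seed(12345)
run_2 <- noisy_embedding()

# Running again without resetting the seed continues the random stream
run_3 <- noisy_embedding()

print(identical(run_1, run_2)) # TRUE
print(identical(run_1, run_3)) # FALSE
```

This is why the second UMAP run in the same session differs from the first, even though the seed was set at the top of the notebook.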

+

Fill in the next few code chunks with the function and the +n_neighbors argument you would like to use for each. (Feel +free to add more tests!) Then run the chunks and compare your output +graphs.

+ + + +
# Try something low?
+UMAP_plot_wrapper(nn_param = 3)
+ + +

+ + + + + + +
# Try something high?
+UMAP_plot_wrapper(nn_param = 100)
+ + +

+ + + + + + +
# Try whatever you like!
+UMAP_plot_wrapper(nn_param = 5)
+ + +

+ + + +
+

Some ‘big picture’ thoughts to take from this experiment:

+
  1. Analyses such as UMAP have various limitations for +interpretability. The coordinates of UMAP output for any given cell can +change dramatically depending on parameters, and even run to run with +the same parameters. This probably means that you shouldn’t rely on the +exact values of UMAP’s output.

    • One particular limitation of UMAP (and t-SNE) is that while observed +clusters have some meaning, the distance between clusters +usually does not (nor does cluster density). The fact that two clusters +are near each other should NOT be interpreted to mean that they are more +related to each other than to more distant clusters. (There is some +disagreement about whether UMAP distances have more meaning, but it is +probably safer to assume they don’t.)

  2. Playing with parameters so you can fine-tune them is a good way +to give you more information about a particular analysis as well as the +data itself.

  3. Where results are consistent, they are more likely to have +meaning. While we do not have labeled cell types in this case, there +does seem to be some consistency of the overall patterns that we see (if +not precise values), and this likely reflects biological information (or +technical artifacts).
+

In summary, if the results of an analysis can be completely changed +by changing its parameters, you should be more cautious when it comes to +the conclusions you draw from it as well as having good rationale for +the parameters you choose.

+
+
+
+

t-SNE comparison

+

In the block below is a similar analysis and plot with t-SNE +(t-distributed Stochastic Neighbor Embedding). Note that this analysis +also uses PCA before moving on to the fancy machine learning.

+ + + +
# Run TSNE
+normalized_sce <- runTSNE(normalized_sce, dimred = "PCA")
+
+# plot with scater function
+plotReducedDim(normalized_sce, "TSNE", color_by = "detected")
+ + +

+ + + +

Different! (Slower!) Is it better or worse? Hard to say! Different +people like different things, and one plot might illustrate a particular +point better than another.

+
+
+
+

Save results

+

We are going to use this data more in the next notebook, so let’s +save it as an RDS file.

+ + + +
readr::write_rds(normalized_sce, file = output_sce_file)
+ + + +
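As a quick check that an RDS file preserves an R object exactly, here is a base R round trip with `saveRDS()` and `readRDS()`, which `readr::write_rds()` and `readr::read_rds()` wrap. The object and the temporary file path are invented for illustration.

```r
# An invented object standing in for our analysis results
toy_results <- list(
  cell_ids = paste0("cell", 1:3),
  pca = matrix(1:6, nrow = 3, dimnames = list(NULL, c("PC1", "PC2")))
)

# Write to a temporary file, then read it back
rds_path <- tempfile(fileext = ".rds")
saveRDS(toy_results, file = rds_path)
restored_results <- readRDS(rds_path)

# The restored object matches the original
print(identical(toy_results, restored_results)) # TRUE
```

Unlike a text format such as TSV, an RDS file keeps the full structure of the object, which is what lets us reload a `SingleCellExperiment` in the next notebook.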
+

Some further reading on dimension reduction:

+
  • This website explains PCA +visually.
  • Becht et +al. (2018) discuss using UMAP for single-cell +data.
  • Wattenberg et +al. (2016) discuss how to use t-SNE properly with great +visuals. (The lessons apply to UMAP as well, with a broad substitution +of the n_neighbors parameter for +perplexity.)
  • Nguyen +& Holmes (2019) lay out guidelines on choosing dimension +reduction methods.
  • Freitag (2019) is a +nice explanation and comparison of many different dimensionality +reduction techniques that you may encounter.
+
+
+
+

Session Info

+ + + +
sessionInfo()
+ + +
R version 4.4.0 (2024-04-24)
+Platform: x86_64-pc-linux-gnu
+Running under: Ubuntu 22.04.4 LTS
+
+Matrix products: default
+BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+
+locale:
+ [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+ [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+ [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+ [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+ [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+
+time zone: Etc/UTC
+tzcode source: system (glibc)
+
+attached base packages:
+[1] stats4    stats     graphics  grDevices utils     datasets  methods  
+[8] base     
+
+other attached packages:
+ [1] scran_1.32.0                scater_1.32.0              
+ [3] scuttle_1.14.0              SingleCellExperiment_1.26.0
+ [5] SummarizedExperiment_1.34.0 Biobase_2.64.0             
+ [7] GenomicRanges_1.56.0        GenomeInfoDb_1.40.0        
+ [9] IRanges_2.38.0              S4Vectors_0.42.0           
+[11] BiocGenerics_0.50.0         MatrixGenerics_1.16.0      
+[13] matrixStats_1.3.0           ggplot2_3.5.1              
+[15] optparse_1.7.5             
+
+loaded via a namespace (and not attached):
+  [1] gridExtra_2.3             rlang_1.1.3              
+  [3] magrittr_2.0.3            compiler_4.4.0           
+  [5] DelayedMatrixStats_1.26.0 vctrs_0.6.5              
+  [7] stringr_1.5.1             pkgconfig_2.0.3          
+  [9] crayon_1.5.2              fastmap_1.1.1            
+ [11] XVector_0.44.0            labeling_0.4.3           
+ [13] utf8_1.2.4                rmarkdown_2.26           
+ [15] tzdb_0.4.0                UCSC.utils_1.0.0         
+ [17] ggbeeswarm_0.7.2          bit_4.0.5                
+ [19] xfun_0.43                 bluster_1.14.0           
+ [21] zlibbioc_1.50.0           cachem_1.0.8             
+ [23] beachmat_2.20.0           jsonlite_1.8.8           
+ [25] highr_0.10                rhdf5filters_1.16.0      
+ [27] DelayedArray_0.30.0       Rhdf5lib_1.26.0          
+ [29] BiocParallel_1.38.0       irlba_2.3.5.1            
+ [31] parallel_4.4.0            cluster_2.1.6            
+ [33] R6_2.5.1                  bslib_0.7.0              
+ [35] stringi_1.8.3             limma_3.60.0             
+ [37] jquerylib_0.1.4           Rcpp_1.0.12              
+ [39] knitr_1.46                R.utils_2.12.3           
+ [41] FNN_1.1.4                 readr_2.1.5              
+ [43] Matrix_1.7-0              igraph_2.0.3             
+ [45] tidyselect_1.2.1          abind_1.4-5              
+ [47] yaml_2.3.8                viridis_0.6.5            
+ [49] codetools_0.2-20          lattice_0.22-6           
+ [51] tibble_3.2.1              withr_3.0.0              
+ [53] Rtsne_0.17                evaluate_0.23            
+ [55] getopt_1.20.4             pillar_1.9.0             
+ [57] generics_0.1.3            vroom_1.6.5              
+ [59] hms_1.1.3                 sparseMatrixStats_1.16.0 
+ [61] munsell_0.5.1             scales_1.3.0             
+ [63] glue_1.7.0                metapod_1.12.0           
+ [65] tools_4.4.0               BiocNeighbors_1.22.0     
+ [67] ScaledMatrix_1.12.0       locfit_1.5-9.9           
+ [69] fs_1.6.4                  cowplot_1.1.3            
+ [71] rhdf5_2.48.0              grid_4.4.0               
+ [73] DropletUtils_1.24.0       edgeR_4.2.0              
+ [75] colorspace_2.1-0          GenomeInfoDbData_1.2.12  
+ [77] beeswarm_0.4.0            BiocSingular_1.20.0      
+ [79] HDF5Array_1.32.0          vipor_0.4.7              
+ [81] cli_3.6.2                 rsvd_1.0.5               
+ [83] fansi_1.0.6               S4Arrays_1.4.0           
+ [85] viridisLite_0.4.2         dplyr_1.1.4              
+ [87] uwot_0.2.2                gtable_0.3.5             
+ [89] R.methodsS3_1.8.2         sass_0.4.9               
+ [91] digest_0.6.35             SparseArray_1.4.0        
+ [93] ggrepel_0.9.5             dqrng_0.3.2              
+ [95] farver_2.1.1              htmltools_0.5.8.1        
+ [97] R.oo_1.26.0               lifecycle_1.0.4          
+ [99] httr_1.4.7                statmod_1.5.0            
+[101] bit64_4.0.5              
+ + +
+ +
dXRlIHRoZSBudW1iZXIgb2Ygc2FtcGxlcyB3aGVyZSBlYWNoIGdlbmUgaXMgZGV0ZWN0ZWQgYW5kIHRoZSBtZWFuIGNvdW50IGFjcm9zcyBhbGwgZ2VuZXMuCldlIGNhbiB0aGVuIHVzZSB0aG9zZSBkYXRhIChzdG9yZWQgaW4gYHJvd0RhdGFgKSB0byBmaWx0ZXIgYnkgcm93IHRvIG9ubHkgdGhlIGdlbmVzIHRoYXQgYXJlIGRldGVjdGVkIGluIGF0IGxlYXN0IDUlIG9mIGNlbGxzLCBhbmQgd2l0aCBhIG1lYW4gY291bnQgPiAwLjEuCgpgYGB7ciBnZW5lX3FjfQpmaWx0ZXJlZF9zY2UgPC0gc2NhdGVyOjphZGRQZXJGZWF0dXJlUUMoZmlsdGVyZWRfc2NlKQpkZXRlY3RlZCA8LSByb3dEYXRhKGZpbHRlcmVkX3NjZSkkZGV0ZWN0ZWQgPiA1CmV4cHJlc3NlZCA8LSByb3dEYXRhKGZpbHRlcmVkX3NjZSkkbWVhbiA+IDAuMQoKIyBmaWx0ZXIgdGhlIGdlbmVzIChyb3dzKSB0aGlzIHRpbWUKZmlsdGVyZWRfc2NlIDwtIGZpbHRlcmVkX3NjZVtkZXRlY3RlZCAmIGV4cHJlc3NlZCwgXQpgYGAKCkhvdyBtYW55IGNlbGxzIGRvIHdlIGhhdmUgbm93PwoKYGBge3IgZmlsdGVyZWRfZGltfQpkaW0oZmlsdGVyZWRfc2NlKQpgYGAKCgojIyMgTm9ybWFsaXplCgpOb3cgd2Ugd2lsbCBwZXJmb3JtIHRoZSBzYW1lIG5vcm1hbGl6YXRpb24gc3RlcHMgd2UgZGlkIGluIGEgcHJldmlvdXMgZGF0YXNldCwgdXNpbmcgYHNjcmFuOjpjb21wdXRlU3VtRmFjdG9ycygpYCBhbmQgYHNjYXRlcjo6bG9nTm9ybUNvdW50cygpYC4KWW91IG1pZ2h0IHJlY2FsbCB0aGF0IHRoZXJlIGlzIGEgYml0IG9mIHJhbmRvbW5lc3MgaW4gc29tZSBvZiB0aGVzZSBjYWxjdWxhdGlvbnMsIHNvIHdlIHNob3VsZCBiZSBzdXJlIHRvIGhhdmUgdXNlZCBgc2V0LnNlZWQoKWAgZWFybGllciBpbiB0aGUgbm90ZWJvb2sgZm9yIHJlcHJvZHVjaWJpbGl0eS4KCmBgYHtyIHN1bWZhY3RvcnN9CiMgQ2x1c3RlciBzaW1pbGFyIGNlbGxzCnFjbHVzdCA8LSBzY3Jhbjo6cXVpY2tDbHVzdGVyKGZpbHRlcmVkX3NjZSkKCiMgQ29tcHV0ZSBzdW0gZmFjdG9ycyBmb3IgZWFjaCBjZWxsIGNsdXN0ZXIgZ3JvdXBpbmcuCmZpbHRlcmVkX3NjZSA8LSBzY3Jhbjo6Y29tcHV0ZVN1bUZhY3RvcnMoZmlsdGVyZWRfc2NlLCBjbHVzdGVycyA9IHFjbHVzdCwgcG9zaXRpdmUgPSBGQUxTRSkKYGBgCgpJdCB0dXJucyBvdXQgaW4gdGhpcyBjYXNlIHdlIGVuZCB1cCB3aXRoIHNvbWUgbmVnYXRpdmUgc2l6ZSBmYWN0b3JzLgpUaGlzIGlzIHVzdWFsbHkgYW4gaW5kaWNhdGlvbiB0aGF0IG91ciBmaWx0ZXJpbmcgd2FzIG5vdCBzdHJpbmdlbnQgZW5vdWdoLCBhbmQgdGhlcmUgcmVtYWluIGEgbnVtYmVyIG9mIGNlbGxzIG9yIGdlbmVzIHdpdGggbmVhcmx5IHplcm8gY291bnRzLgpUaGlzIHByb2JhYmx5IGhhcHBlbmVkIHdoZW4gd2UgcmVtb3ZlZCB0aGUgaW5mcmVxdWVudGx5LWV4cHJlc3NlZCBnZW5lczsgY2VsbHMgd2hpY2ggaGFkIGhpZ2ggY291bnRzIGZyb20gdGhvc2UgcGFydGljdWxhciBnZW5lcyAoYW5k
IGZldyBvdGhlcnMpIGNvdWxkIGhhdmUgaGFkIHRoZWlyIHRvdGFsIGNvdW50cyBkcmFtYXRpY2FsbHkgcmVkdWNlZC4KClRvIGFjY291bnQgZm9yIHRoaXMsIHdlIHdpbGwgcmVjYWxjdWxhdGUgdGhlIHBlci1jZWxsIHN0YXRzIGFuZCBmaWx0ZXIgb3V0IGxvdyBjb3VudHMuClVuZm9ydHVuYXRlbHksIHRvIGRvIHRoaXMsIHdlIG5lZWQgdG8gZmlyc3QgcmVtb3ZlIHRoZSBwcmV2aW91c2x5IGNhbGN1bGF0ZWQgc3RhdGlzdGljcywgd2hpY2ggd2Ugd2lsbCBkbyBieSBzZXR0aW5nIHRoZW0gdG8gYE5VTExgLgoKYGBge3IgcmVRQ30KIyByZW1vdmUgcHJldmlvdXMgY2FsY3VsYXRpb25zCmZpbHRlcmVkX3NjZSRzdW0gPC0gTlVMTApmaWx0ZXJlZF9zY2UkZGV0ZWN0ZWQgPC0gTlVMTApmaWx0ZXJlZF9zY2UkdG90YWwgPC0gTlVMTApmaWx0ZXJlZF9zY2Ukc3Vic2V0c19taXRvX3N1bSA8LSBOVUxMCmZpbHRlcmVkX3NjZSRzdWJzZXRzX21pdG9fZGV0ZWN0ZWQgPC0gTlVMTApmaWx0ZXJlZF9zY2Ukc3Vic2V0c19taXRvX3N1bSA8LSBOVUxMCgojIHJlY2FsY3VsYXRlIGNlbGwgc3RhdHMKZmlsdGVyZWRfc2NlIDwtIHNjYXRlcjo6YWRkUGVyQ2VsbFFDKGZpbHRlcmVkX3NjZSwgc3Vic2V0cyA9IGxpc3QobWl0byA9IG1pdG9fZ2VuZXMpKQoKIyBwcmludCB0aGUgbnVtYmVyIG9mIGNlbGxzIHdpdGggZmV3ZXIgdGhhbiA1MDAgVU1JcwpzdW0oZmlsdGVyZWRfc2NlJHN1bSA8IDUwMCkKYGBgCgpOb3cgd2UgY2FuIGZpbHRlciBhZ2Fpbi4KSW4gdGhpcyBjYXNlLCB3ZSB3aWxsIGtlZXAgY2VsbHMgd2l0aCBhdCBsZWFzdCA1MDAgVU1JcyBhZnRlciByZW1vdmluZyB0aGUgbG93bHkgZXhwcmVzc2VkIGdlbmVzLgpUaGVuIHdlIHdpbGwgcmVkbyB0aGUgc2l6ZSBmYWN0b3IgY2FsY3VsYXRpb24sIGhvcGVmdWxseSB3aXRoIG5vIG1vcmUgd2FybmluZ3MuCgoKYGBge3IgcmVmaWx0ZXJ9CmZpbHRlcmVkX3NjZSA8LSBmaWx0ZXJlZF9zY2VbLCBmaWx0ZXJlZF9zY2Ukc3VtID49IDUwMF0KCnFjbHVzdCA8LSBzY3Jhbjo6cXVpY2tDbHVzdGVyKGZpbHRlcmVkX3NjZSkKCmZpbHRlcmVkX3NjZSA8LSBzY3Jhbjo6Y29tcHV0ZVN1bUZhY3RvcnMoZmlsdGVyZWRfc2NlLCBjbHVzdGVycyA9IHFjbHVzdCkKYGBgCgpMb29rcyBnb29kISBOb3cgd2UnbGwgZG8gdGhlIG5vcm1hbGl6YXRpb24uCgpgYGB7ciBub3JtYWxpemV9CiMgTm9ybWFsaXplIGFuZCBsb2cgdHJhbnNmb3JtLgpub3JtYWxpemVkX3NjZSA8LSBzY2F0ZXI6OmxvZ05vcm1Db3VudHMoZmlsdGVyZWRfc2NlKQpgYGAKCkF0IHRoaXMgcG9pbnQsIHdlIGhhdmUgYSBmZXcgZGlmZmVyZW50IHZlcnNpb25zIG9mIG91ciBgU2luZ2xlQ2VsbEV4cGVyaW1lbnRgIG9iamVjdC4KVGhlIG9yaWdpbmFsIChtb3N0bHkpIHVuZmlsdGVyZWQgdmVyc2lvbiBpcyBpbiBgaG9kZ2tpbnNfc2NlYCwgdGhlIGZpbHRlcmVkIHZlcnNpb24gaW4gYGZpbHRlcmVkX3NjZWAsIGFuZCB0aGUgbm9ybWFsaXplZCB2ZXJz
aW9uIGluIGBub3JtYWxpemVkX3NjZWAuCldlIGNhbiBjbGVhbiB0aG9zZSB1cCBhIGJpdCB0byBzYXZlIG1lbW9yeSwga2VlcGluZyBvbmx5IHRoZSBsYXRlc3QgYG5vcm1hbGl6ZWRfc2NlYCB2ZXJzaW9uLCB3aGljaCBub3cgaGFzIHR3byBgYXNzYXlgczoKYGNvdW50c2Agd2l0aCB0aGUgcmF3IGRhdGEgYW5kIGBsb2djb3VudHNgIHdpdGggdGhlIG5vcm1hbGl6ZWQgYW5kIHRyYW5zZm9ybWVkIGRhdGEuCgpgYGB7ciBjbGVhbl91cCwgbGl2ZSA9IFRSVUV9CmFzc2F5TmFtZXMobm9ybWFsaXplZF9zY2UpCnJtKGhvZGdraW5zX3NjZSwgZmlsdGVyZWRfc2NlKQpgYGAKCgojIyBEaW1lbnNpb25hbGl0eSByZWR1Y3Rpb24gYW5kIGRpc3BsYXkKCiFbUm9hZG1hcDogRGltZW5zaW9uYWxpdHkgcmVkdWN0aW9uXShkaWFncmFtcy9yb2FkbWFwX3NpbmdsZV9kaW1lbnNpb25fcmVkdWN0aW9uLnBuZykKCiMjIyBQcmluY2lwYWwgQ29tcG9uZW50cyBBbmFseXNpcwoKUHJpbmNpcGFsIGNvbXBvbmVudCBhbmFseXNpcyAoUENBKSBpcyBhIGRpbWVuc2lvbmFsaXR5IHJlZHVjdGlvbiB0ZWNobmlxdWUgdGhhdCBhbGxvd3MgdXMgdG8gaWRlbnRpZnkgdGhlIGxhcmdlc3QgY29tcG9uZW50cyBvZiB2YXJpYXRpb24gaW4gYSBjb21wbGV4IGRhdGFzZXQuCk91ciBleHByZXNzaW9uIGRhdGEgY2FuIGJlIHRob3VnaHQgb2YgYXMgbWFwcGluZyBlYWNoIHNhbXBsZSBpbiBhIG11bHRpZGltZW5zaW9uYWwgc3BhY2UgZGVmaW5lZCBieSB0aGUgZXhwcmVzc2lvbiBsZXZlbCBvZiBlYWNoIGdlbmUuClRoZSBleHByZXNzaW9uIG9mIG1hbnkgb2YgdGhvc2UgZ2VuZXMgYXJlIGNvcnJlbGF0ZWQsIHNvIHdlIGNhbiBvZnRlbiBnZXQgYSBiZXR0ZXIsIHNpbXBsZXIgcGljdHVyZSBvZiB0aGUgZGF0YSBieSBjb21iaW5pbmcgdGhlIGluZm9ybWF0aW9uIGZyb20gdGhvc2UgY29ycmVsYXRlZCBnZW5lcy4KClBDQSByb3RhdGVzIGFuZCB0cmFuc2Zvcm1zIHRoaXMgc3BhY2Ugc28gdGhhdCBlYWNoIGF4aXMgaXMgbm93IGEgY29tYmluYXRpb24gb2YgbXVsdGlwbGUgY29ycmVsYXRlZCBnZW5lcywgb3JkZXJlZCBzbyB0aGUgZmlyc3QgYXhlcyBjYXB0dXJlIHRoZSBtb3N0IHZhcmlhdGlvbiBmcm9tIHRoZSBkYXRhLgpUaGVzZSBuZXcgYXhlcyBhcmUgdGhlICJwcmluY2lwYWwgY29tcG9uZW50cy4iCklmIHdlIGxvb2sgYXQgdGhlIGZpcnN0IGZldyBjb21wb25lbnRzLCB3ZSBjYW4gb2Z0ZW4gZ2V0IGEgbmljZSBvdmVydmlldyBvZiByZWxhdGlvbnNoaXBzIGFtb25nIHRoZSBzYW1wbGVzIGluIHRoZSBkYXRhLgoKIyMjIyBTdG9yaW5nIFBDQSByZXN1bHRzIHdpdGggdGhlIHJhdyBkYXRhCgpXZSB3aWxsIHN0b3JlIHRoZSBQQ0EgcmVzdWx0cyBpbiBvdXIgYFNpbmdsZUNlbGxFeHBlcmltZW50YCBvYmplY3QsIGFzIHdlIHdpbGwgd2FudCB0byB1c2UgdGhlbSBsYXRlci4KVG8gZG8gdGhpcywgd2Ugd2lsbCB1c2UgdGhlIGBydW5QQ0EoKWAgZnVuY3Rpb24gZnJvbSBg
c2NhdGVyYCwgd2hpY2ggcGVyZm9ybXMgdGhlIFBDQSBjYWxjdWxhdGlvbnMgYW5kIHJldHVybnMgYSBuZXcgb2JqZWN0IHdpdGggdGhlIHJlc3VsdHMgc3RvcmVkIGluIHRoZSBgcmVkdWNlZERpbWAgc2xvdC4KSWYgd2Ugd2FudGVkIHRvLCB3ZSBjb3VsZCBnZXQgdGhlIHJhdyByZXN1bHRzIGFzIGEgbWF0cml4IGluc3RlYWQgd2l0aCBgY2FsY3VsYXRlUENBKClgIGZ1bmN0aW9uLCBhcyB3ZSBkaWQgaW4gYSBwcmV2aW91cyBub3RlYm9vay4KCldlIHdpbGwgYWxzbyB1c2UgdGhlIGBudG9wYCBhcmd1bWVudCB0byBjYWxjdWxhdGUgdGhlIFBDQSB1c2luZyAyMDAwIGdlbmVzIHdpdGggdGhlIGhpZ2hlc3QgdmFyaWFuY2UuClRoZSBkZWZhdWx0IGlzIGBudG9wID0gNTAwYC4KCmBgYHtyIHJ1blBDQSwgbGl2ZSA9IFRSVUV9CiMgY2FsY3VsYXRlIFBDQSB1c2luZyB0aGUgdG9wIDIwMDAgZ2VuZXMKbm9ybWFsaXplZF9zY2UgPC0gcnVuUENBKG5vcm1hbGl6ZWRfc2NlLCBudG9wID0gMjAwMCkKYGBgCgpXZSBjYW4gc2VlIHdoYXQgcmVkdWNlZCBkaW1lbnNpb25hbGl0eSBtYXRyaWNlcyBhcmUgc3RvcmVkIGluIHRoZSBvYmplY3Qgd2l0aCB0aGUgYHJlZHVjZWREaW1OYW1lcygpYCBmdW5jdGlvbi4KCmBgYHtyIHJlZHVjZWRfZGltX25hbWVzLCBsaXZlID0gVFJVRX0KIyBwcmludCB0aGUgcmVkdWNlZCBkaW1lbnNpb25hbGl0eSB0YWJsZXMgYXZhaWxhYmxlCnJlZHVjZWREaW1OYW1lcyhub3JtYWxpemVkX3NjZSkKYGBgCgpUbyBleHRyYWN0IHRoZW0gYnkgbmFtZSwgd2UgdXNlIHRoZSBgcmVkdWNlZERpbSgpYCBmdW5jdGlvbiwgbXVjaCBsaWtlIHRoZSBgYXNzYXkoKWAgZnVuY3Rpb24gdG8gZXh0cmFjdCBvcmlnaW5hbCBkYXRhLgoKYGBge3IgZXh0cmFjdF9yZWR1Y2VkLCBsaXZlID0gVFJVRX0KIyBwcmludCB0aGUgdG9wIGNvcm5lciBvZiB0aGUgUENBIG1hdHJpeApyZWR1Y2VkRGltKG5vcm1hbGl6ZWRfc2NlLCAiUENBIilbMToxMCwgMTo1XQpgYGAKCiMjIyMgUGxvdHRpbmcgUENBIHJlc3VsdHMKCklmIHdlIGhhdmUgdGhlIFBDQSByZXN1bHRzIHN0b3JlZCBpbiB0aGUgYFNpbmdsZUNlbGxFeHBlcmltZW50YCBvYmplY3QsIHdlIGNhbiB1c2UgdGhlIGBzY2F0ZXI6OnBsb3RSZWR1Y2VkRGltKClgIGZ1bmN0aW9uIHRvIHBsb3QgaXQgd2l0aCBzb21lIG5pY2UgZGVmYXVsdHMgZWFzaWx5LgpPbmUgbmljZSB0aGluZyBhYm91dCB0aGlzIGZ1bmN0aW9uIGlzIHRoYXQgaXQgdXNlcyBgZ2dwbG90MmAgdW5kZXIgdGhlIGhvb2QsIHNvIGlmIHdlIHdhbnRlZCB0byBjdXN0b21pemUgaXQgbGF0ZXIsIHdlIGNvdWxkLgoKYGBge3IgcGxvdFBDQSwgbGl2ZSA9IFRSVUV9CiMgcGxvdCBQQ0EgcmVzdWx0cwpwbG90UmVkdWNlZERpbShub3JtYWxpemVkX3NjZSwgIlBDQSIpCmBgYApQQ0EgZ2l2ZXMgdXMgYSBtYXRyaXggd2l0aCBtb3JlIHRoYW4ganVzdCB0d28gZGltZW5zaW9ucywgYW5kIHdlIG1pZ2h0IHdhbnQgdG8gbG9vayBhdCBzb21l
IGhpZ2hlciBkaW1lbnNpb25zIHRvby4KV2UgY2FuIGRvIHRoYXQgd2l0aCB0aGUgYG5jb21wb25lbnRzYCBhcmd1bWVudC4KCmBgYHtyIHBsb3RQQ0EzNCwgbGl2ZSA9IFRSVUV9CiMgcGxvdCBQQzMgYW5kIFBDNApwbG90UmVkdWNlZERpbShub3JtYWxpemVkX3NjZSwgIlBDQSIsIG5jb21wb25lbnRzID0gYygzLDQpKQpgYGAKCiMjIyBNb2RlbGluZyB2YXJpYW5jZQoKVGhlIHZhcmlhdGlvbiBpbiBnZW5lIGV4cHJlc3Npb24gd2Ugc2VlIGFtb25nIGNlbGxzIGNvbWVzIGZyb20gYSBjb21iaW5hdGlvbiBvZiB2YXJpYXRpb24gZHVlIHRvIHRlY2huaWNhbCBlZmZlY3RzIGFuZCB0aGUgYmlvbG9neSB3ZSByZWFsbHkgY2FyZSBhYm91dC4KSW4gb3JkZXIgdG8gcm91Z2hseSBhY2NvdW50IGZvciB0aGlzIHdlIGNvdWxkIGp1c3QgdGFrZSB0aGUgbGFyZ2VzdCB2YXJpYW5jZSBnZW5lcywgb24gdGhlIGFzc3VtcHRpb24gdGhhdCBsb3cgdmFyaWFuY2UgZ2VuZXMgYXJlIG1vc3RseSBqdXN0IG5vaXNlLgpUaGlzIGlzIHRoZSBkZWZhdWx0IGFwcHJvYWNoIHRoYXQgYHJ1blBDQSgpYCBhbmQgYGNhbGN1bGF0ZVBDQSgpYCB0YWtlLCB1c2luZyB0aGUgZ2VuZXMgd2l0aCB0aGUgZ3JlYXRlc3QgdmFyaWFuY2UgYWNyb3NzIGNlbGxzIHRvIGNhbGN1bGF0ZSB0aGUgUENBIG1hdHJpeC4KCklmIHdlIHdhbnQgdG8gYmUgYSBiaXQgbW9yZSBjYXJlZnVsIGFib3V0IGl0LCB3ZSBjYW4gbW9kZWwgdGhlIHZhcmlhbmNlIGluIGV4cHJlc3Npb24gb2YgZWFjaCBnZW5lIGFzIGEgZnVuY3Rpb24gb2YgdGhlIG1lYW4gZXhwcmVzc2lvbiBmb3IgdGhhdCBnZW5lLgpUaGlzIGlzIHVzZWZ1bCBiZWNhdXNlIHdlIGdlbmVyYWxseSBleHBlY3QgdGhlIHZhcmlhbmNlIHRvIGluY3JlYXNlIGFzIG1lYW4gZXhwcmVzc2lvbiBpbmNyZWFzZXMsIGV2ZW4gaWYgdGhlcmUgaXMgbm8gYmlvbG9naWNhbCBzaWduYWwgaW4gdGhlIGV4cHJlc3Npb24gdmFyaWF0aW9uLgoKV2Ugd2lsbCBkbyB0aGlzIG1vZGVsaW5nIG9mIHZhcmlhbmNlIGJ5IGV4cHJlc3Npb24gd2l0aCB0aGUgYHNjcmFuOjptb2RlbEdlbmVWYXIoKWAgZnVuY3Rpb24sIHNhdmluZyB0aGUgcmVzdWx0cyB0byBhIG5ldyB2YXJpYWJsZS4KCmBgYHtyIG1vZGVsX3ZhcmlhbmNlfQpnZW5lX3ZhcmlhbmNlIDwtIHNjcmFuOjptb2RlbEdlbmVWYXIobm9ybWFsaXplZF9zY2UpCmBgYAoKTm93IGxldCdzIHBsb3QgdGhlIHJlbGF0aW9uc2hpcCBiZXR3ZWVuIGdlbmUgZXhwcmVzc2lvbiBhbmQgdmFyaWFuY2Ugd2Ugd2VyZSBkaXNjdXNzaW5nLgpIZXJlIHdlIHdpbGwgYWxzbyBhZGQgdGhlIGZpdHRpbmcgY3VydmUgdGhhdCBgc2NyYW46Om1vZGVsR2VuZVZhcigpYCBjcmVhdGVkLCB3aGljaCBpcyBzdG9yZWQgYXMgZnVuY3Rpb24gaW4gdGhlICBgJHRyZW5kYCBzbG90IG9mIHRoZSBgZ2VuZV92YXJpYW5jZWAgb2JqZWN0LgpXZSBjYW4gYWRkIGEgZnVuY3Rpb24gbGlrZSB0aGF0IGN1cnZlIHRvIGEgYGdncGxvdGAg
d2l0aCBhIGBzdGF0X2Z1bmN0aW9uYCBsYXllci4KCmBgYHtyIHBsb3RfdmFyaWFuY2V9CmdncGxvdChhcy5kYXRhLmZyYW1lKGdlbmVfdmFyaWFuY2UpLCBhZXMoeCA9IG1lYW4sIHkgPSB0b3RhbCkpICsKICBnZW9tX3BvaW50KGFscGhhID0gMC4xKSArCiAgc3RhdF9mdW5jdGlvbihmdW4gPSBtZXRhZGF0YShnZW5lX3ZhcmlhbmNlKSR0cmVuZCwgY29sb3IgPSAiYmx1ZSIpICsKICBsYWJzKAogICAgeCA9ICJNZWFuIGxvZy1leHByZXNzaW9uIiwKICAgIHkgPSAiVmFyaWFuY2UiKSArCiAgdGhlbWVfYncoKQpgYGAKCk5vdyB3ZSBjYW4gdXNlIGBzY3Jhbjo6Z2V0VG9wSFZHcygpYCB0byBzZWxlY3QgdGhlIGdlbmVzIHRoYXQgaGF2ZSB0aGUgbW9zdCBiaW9sb2dpY2FsIHZhcmlhdGlvbiAoYWNjb3JkaW5nIHRvIHRoZSBtb2RlbCkgYW5kIHJlY2FsY3VsYXRlIFBDQSBzY29yZXMgdXNpbmcgb25seSB0aG9zZSBnZW5lcy4KKEluIHByYWN0aWNlLCB3ZSBhcmUgc2VsZWN0aW5nIHRoZSBnZW5lcyB3aXRoIHRoZSBsYXJnZXN0IHJlc2lkdWFsIHZhcmlhdGlvbiBhZnRlciByZW1vdmluZyB0ZWNobmljYWwgdmFyaWF0aW9uIG1vZGVsZWQgYnkgdGhlIG1lYW4vdmFyaWFuY2UgcmVsYXRpb25zaGlwLikKCkhlcmUgd2UgYXJlIHBpY2tpbmcgdGhlIDIwMDAgdG9wIGdlbmVzIHRvIG1hdGNoIHRoZSBudW1iZXIgb2YgZ2VuZXMgZnJvbSBvdXIgZWFybGllciBjYWxjdWxhdGlvbnMuCgpgYGB7ciBnZXRfaGlnaHZhciwgbGl2ZSA9IFRSVUV9CiMgc2VsZWN0IHRoZSBtb3N0IHZhcmlhYmxlIGdlbmVzCmhpZ2h2YXJfZ2VuZXMgPC0gc2NyYW46OmdldFRvcEhWR3MoZ2VuZV92YXJpYW5jZSwgbiA9IDIwMDApCiMgY2FsY3VsYXRlIGEgUENBIG1hdHJpeCB1c2luZyB0aG9zZSBnZW5lcwpub3JtYWxpemVkX3NjZSA8LSBydW5QQ0Eobm9ybWFsaXplZF9zY2UsIHN1YnNldF9yb3cgPSBoaWdodmFyX2dlbmVzKQpgYGAKCk5vdyB3ZSBjYW4gcGxvdCBvdXIgbmV3IFBDQSB2YWx1ZXMgZm9yIGNvbXBhcmlzb24uCllvdSBtaWdodCByZWFsaXplIHRoYXQgb3VyIG9sZCBQQ0EgdmFsdWVzIHdlcmUgcmVwbGFjZWQgd2hlbiB3ZSByYW4gYHJ1blBDQSgpYCBhZ2Fpbiwgc28gd2UgY2FuJ3QgcmVjcmVhdGUgdGhlIGVhcmxpZXIgcGxvdHMgYXQgdGhpcyBzdGFnZSBvZiB0aGUgbm90ZWJvb2suCllvdSB3aWxsIGhhdmUgdG8gc2Nyb2xsIHVwIHRvIHlvdXIgZWFybGllciBwbG90cyB0byBjb21wYXJlLgoKYGBge3IgcGxvdFBDQV9oaWdodmFyLCBsaXZlID0gVFJVRX0KIyBwbG90IHRoZSBuZXcgUENBIHJlc3VsdHMKcGxvdFJlZHVjZWREaW0obm9ybWFsaXplZF9zY2UsICJQQ0EiKQpwbG90UmVkdWNlZERpbShub3JtYWxpemVkX3NjZSwgIlBDQSIsIG5jb21wb25lbnRzID0gYygzLDQpKQpgYGAKCiMjIyBVTUFQCgoqKlVNQVAqKiAoVW5pZm9ybSBNYW5pZm9sZCBBcHByb3hpbWF0aW9uIGFuZCBQcm9qZWN0aW9uKSBpcyBhIG1hY2hpbmUgbGVhcm5pbmcgdGVjaG5pcXVl
IGRlc2lnbmVkIHRvIHByb3ZpZGUgbW9yZSBkZXRhaWwgaW4gaGlnaGx5IGRpbWVuc2lvbmFsIGRhdGEgdGhhbiBhIHR5cGljYWwgcHJpbmNpcGFsIGNvbXBvbmVudHMgYW5hbHlzaXMuCldoaWxlIFBDQSBhc3N1bWVzIHRoYXQgdGhlIHZhcmlhdGlvbiB3ZSBjYXJlIGFib3V0IGhhcyBhIHBhcnRpY3VsYXIgZGlzdHJpYnV0aW9uIChub3JtYWwsIGJyb2FkbHkgc3BlYWtpbmcpLCBVTUFQIGFsbG93cyBtb3JlIGNvbXBsaWNhdGVkIGRpc3RyaWJ1dGlvbnMgdGhhdCBpdCBsZWFybnMgZnJvbSB0aGUgZGF0YS4KVGhlIHVuZGVybHlpbmcgbWF0aGVtYXRpY3MgYXJlIGJleW9uZCBtZSwgYnV0IGlmIHlvdSBhcmUgbW9yZSBhbWJpdGlvdXMgdGhhbiBJLCB5b3UgY2FuIGxvb2sgYXQgdGhlIHBhcGVyIGJ5IFtNY0lubmVzLCBIZWFseSwgJiBNZWx2aWxsZSAoMjAxOCldKGh0dHBzOi8vYXJ4aXYub3JnL2Ficy8xODAyLjAzNDI2KS4KVGhlIG1haW4gYWR2YW50YWdlIG9mIHRoaXMgY2hhbmdlIGluIHVuZGVybHlpbmcgYXNzdW1wdGlvbnMgaXMgdGhhdCBVTUFQIGNhbiBkbyBhIGJldHRlciBqb2Igc2VwYXJhdGluZyBjbHVzdGVycywgZXNwZWNpYWxseSB3aGVuIHNvbWUgb2YgdGhvc2UgY2x1c3RlcnMgbWF5IGJlIG1vcmUgc2ltaWxhciB0byBlYWNoIG90aGVyIHRoYW4gb3RoZXJzLgoKQW5vdGhlciBkaW1lbnNpb25hbGl0eSByZWR1Y3Rpb24gdGVjaG5pcXVlIHRoYXQgeW91IG1heSBoYXZlIGhlYXJkIG9mIGlzICoqdC1TTkUqKiAodC1kaXN0cmlidXRlZCBTdG9jaGFzdGljIE5laWdoYm9yIEVtYmVkZGluZyksIHdoaWNoIGhhcyBzaW1pbGFyIHByb3BlcnRpZXMgdG8gVU1BUCwgYW5kIG9mdGVuIHByb2R1Y2VzIHNpbWlsYXIgcmVzdWx0cy4KVGhlcmUgaXMgc29tZSBvbmdvaW5nIGRlYmF0ZSBhYm91dCB3aGljaCBvZiB0aGVzZSB0d28gdGVjaG5pcXVlcyBpcyBzdXBlcmlvciwgYW5kIHdoZXRoZXIgdGhlIGRpZmZlcmVuY2VzIGFyZSBkdWUgdG8gdGhlIHVuZGVybHlpbmcgYWxnb3JpdGhtIG9yIHRvIGltcGxlbWVudGF0aW9uIGFuZCBwYXJhbWV0ZXIgaW5pdGlhbGl6YXRpb24gZGVmYXVsdHMuClJlZ2FyZGxlc3Mgb2Ygd2h5LCBpbiBvdXIgZXhwZXJpZW5jZSwgVU1BUCBzZWVtcyB0byBwcm9kdWNlIHNsaWdodGx5IGJldHRlciByZXN1bHRzIGFuZCBydW4gYSBiaXQgZmFzdGVyLCBidXQgdGhlIGRpZmZlcmVuY2VzIGNhbiBiZSBzdWJ0bGUuCgojIyMjIERlZmF1bHQgcGFyYW1ldGVycwoKRm9yIGVhc2Ugb2YgdXNlIHdpdGggdGhpcyBkYXRhLCB3ZSB3aWxsIGJlIHVzaW5nIHRoZSBgc2NhdGVyOjpjYWxjdWxhdGVVTUFQKClgIGFuZCBgc2NhdGVyOjpydW5VTUFQKClgIGZ1bmN0aW9uIHRvIGFwcGx5IFVNQVAgdG8gb3VyIHNpbmdsZSBjZWxsIGRhdGEsIGJ1dCBzaW1pbGFyIGZ1bmN0aW9ucyB0aGUgYHV3b3RgIHBhY2thZ2UgKG5vdGFibHkgYHV3b3Q6OnVtYXAoKWApIGNhbiBiZSB1c2VkIHRvIGFwcGx5IFVNQVAgdG8gYW55IG51bWVyaWNh
bCBtYXRyaXguCgpVTUFQIGNhbiBiZSBzbG93IGZvciBhIGxhcmdlIGRhdGEgc2V0IHdpdGggbG90cyBvZiBwYXJhbWV0ZXJzLgpJdCBpcyB3b3J0aCBub3RpbmcgdGhhdCB0aGUgYHNjYXRlcjo6Y2FsY3VsYXRlVU1BUCgpYCBpbXBsZW1lbnRhdGlvbiBhY3R1YWxseSBkb2VzIFBDQSBmaXJzdCwgYW5kIHRoZW4gcnVucyBVTUFQIG9uIHRoZSB0b3AgNTAgUENzLgpJZiB3ZSBoYXZlIGFscmVhZHkgY2FsY3VsYXRlZCBQQ0EgKGFzIHdlIGhhdmUpIHdlIGNhbiB0ZWxsIGl0IHRvIHVzZSB0aG9zZSByZXN1bHRzIHdpdGggdGhlIGBkaW1yZWRgIGFyZ3VtZW50LgoKQXMgd2l0aCBQQ0EsIHRoZXJlIGFyZSB0d28gZnVuY3Rpb25zIHdlIGNvdWxkIHVzZToKYHNjYXRlcjo6Y2FsY3VsYXRlVU1BUCgpYCB3aWxsIHJldHVybiBhIG1hdHJpeCBvZiByZXN1bHRzLCB3aXRoIG9uZSByb3cgZm9yIGVhY2ggc2FtcGxlLCBhbmQgYSBjb2x1bW4gZm9yIGVhY2ggb2YgdGhlIFVNQVAgZGltZW5zaW9ucyByZXR1cm5lZC4KYHNjYXRlcjo6cnVuVU1BUCgpYCBwZXJmb3JtcyB0aGUgc2FtZSBmdW5jdGlvbiwgYnV0IHJldHVybnMgdGhlIHJlc3VsdHMgaW4gYSBTaW5nbGVDZWxsRXhwZXJpbWVudCBvYmplY3QuCgpMZXQncyBzZWUgaG93IGl0IGxvb2tzIHdpdGggdGhlIChtb3N0bHkpIGRlZmF1bHQgcGFyYW1ldGVyczoKCmBgYHtyIGNhbGN1bGF0ZV91bWFwLCBsaXZlID0gVFJVRX0KIyBSdW4gVU1BUApub3JtYWxpemVkX3NjZSA8LSBydW5VTUFQKG5vcm1hbGl6ZWRfc2NlLAogICAgICAgICAgICAgICAgICAgICAgICAgIGRpbXJlZCA9ICJQQ0EiKSAjIHVzZSBhbHJlYWR5IHN0b3JlZCBQQ0EgcmVzdWx0cwpgYGAKCk5vdyB3ZSBjYW4gcGxvdCB3aXRoIHRoZSBzYW1lIGBwbG90UmVkdWNlZERpbSgpYCBmdW5jdGlvbiwgc3BlY2lmeWluZyB3ZSB3YW50IHRvIHBsb3QgdGhlIFVNQVAgcmVzdWx0cyB0aGlzIHRpbWUuCldlIHdpbGwgYWxzbyBhZGQgc29tZSBjb2xvciB0aGlzIHRpbWUgd2l0aCB0aGUgYGNvbG9yX2J5YCBhcmd1bWVudCwgdXNpbmcgdGhlIG51bWJlciBvZiBnZW5lcyBkZXRlY3RlZCBpbiBlYWNoIGNlbGwgdG8gYXNzaWduIGEgaHVlLgoKYGBge3IgcGxvdF91bWFwLCBsaXZlID0gVFJVRX0KIyBtYWtlIGEgVU1BUCBwbG90IHdpdGggYHBsb3RSZWR1Y2VkRGltKClgCnBsb3RSZWR1Y2VkRGltKG5vcm1hbGl6ZWRfc2NlLCAiVU1BUCIsIGNvbG9yX2J5ID0gImRldGVjdGVkIikKYGBgCgpUaGVyZSBpcyBjbGVhcmx5IGEgbG90IG9mIHN0cnVjdHVyZSBpbiB0aGVyZSwgYnV0IGlzIGl0IG1lYW5pbmdmdWw/CkRvIHRoZSBjbHVzdGVycyB3ZSBzZWUgZGlmZmVyZW50aWF0ZSBjZWxsIHR5cGVzPyBIb3cgc2hvdWxkIHdlIGRpdmlkZSB0aGVtIHVwPwoKV2Ugd2lsbCBjb21lIGJhY2sgdG8gdGhpcyBxdWVzdGlvbiBsYXRlciEKCiMjIyBVTUFQIGV4cGVyaW1lbnRzCgpOb3cgdGhhdCB3ZSBoYXZlIGFuIGlkZWEgb2Ygd2hhdCBhIFVNQVAgcGxvdCB3aXRo
IHRoZSBkZWZhdWx0IHBhcmFtZXRlcnMgbG9va3MgbGlrZSwgbGV0J3MgdHJ5IGV4cGVyaW1lbnRpbmcgd2l0aCB0aGUgYG5fbmVpZ2hib3JzYCBwYXJhbWV0ZXIuCkZpcnN0LCB3ZSBzaG91bGQgc2VlIHdoYXQgdGhpcyBwYXJhbWV0ZXIgaXMsIGFuZCB3aGF0IHRoZSBkZWZhdWx0IHZhbHVlIGlzLgpJbiB0aGUgY29uc29sZSwgcnVuIGA/c2NhdGVyOjpjYWxjdWxhdGVVTUFQYCB0byBzZWUgd2hhdCB0aGlzIChhbmQgb3RoZXIgcGFyYW1ldGVycykgYXJlLgpGb3IgZXZlbiBtb3JlIHBhcmFtZXRlcnMsIHlvdSBjYW4gbG9vayBhdCB0aGUgdW5kZXJseWluZyBpbXBsZW1lbnRhdGlvbiBjb2RlIHRoYXQgYGNhbGN1bGF0ZVVNQVAoKWAgdXNlcywgd2hpY2ggaXMgdGhlIGZ1bmN0aW9uIGB1d290Ojp1bWFwKClgCgpJbiBvcmRlciB0byBtYWtlIG91ciBleHBlcmltZW50YXRpb24gZWFzaWVyLCB3ZSB3aWxsIGNyZWF0ZSBhICpmdW5jdGlvbiogdGhhdCBhbGxvd3MgdXMgdG8gcmVydW4gdGhlIHNhbWUgY29kZSBlYXNpbHksIGJ1dCBjcmVhdGUgYW4gYXJndW1lbnQgdGhhdCBhbGxvd3MgdXMgdG8gY2hhbmdlIG9uZSB2YXJpYWJsZTogdGhlIGBuX25laWdoYm9yc2AgdmFyaWFibGUuCkhlcmUgd2UgYXJlIHNhdmluZyBvbmx5IGEgbGluZSBvZiBjb2RlLCBidXQgd2UgY291bGQgYXBwbHkgdGhpcyB0byBhIG11Y2ggbW9yZSBjb21wbGV4IHNlcmllcyBvZiBvcGVyYXRpb25zIGlmIHdlIHdhbnRlZCB0byEKCmBgYHtyIFVNQVAtZnVuY3Rpb259ClVNQVBfcGxvdF93cmFwcGVyIDwtIGZ1bmN0aW9uKHNjZSA9IG5vcm1hbGl6ZWRfc2NlLCBubl9wYXJhbSA9IDE1KSB7CiAgIyBQdXJwb3NlOiBSdW4gVU1BUCBhbmQgcGxvdCB0aGUgb3V0cHV0CiAgIyBBcmdzOiBubl9wYXJhbTogYSBzaW5nbGUgbnVtZXJpYyBhcmd1bWVudCB0aGF0IHdpbGwgY2hhbmdlIHRoZQogICMgICAgICAgICAgICAgICAgIG5fbmVpZ2hib3JzIHZhcmlhYmxlIGluIHRoZSBjYWxjdWxhdGVVTUFQKCkgZnVuY3Rpb24uCiAgIyBPdXRwdXQ6IGEgc2NhdHRlcnBsb3Qgd2l0aCB0aGUgdHdvIFVNQVAgY29vcmRpbmF0ZXMgcGxvdHRlZCBhbmQKICAjICAgICAgICAgY2VsbC10eXBlcyBsYWJlbGVkIHdpdGggZGF0YSBwb2ludCBjb2xvcnMuCgogICMgUnVuIFVNQVAgd2l0aCBhIHNwZWNpZmllZCBuX25laWdoYm9ycyBwYXJhbWV0ZXIKICBzY2VfdW1hcCA8LSBzY2F0ZXI6OnJ1blVNQVAoc2NlLCBkaW1yZWQgPSAiUENBIiwgbl9uZWlnaGJvcnMgPSBubl9wYXJhbSkKICBzY2F0ZXI6OnBsb3RSZWR1Y2VkRGltKHNjZV91bWFwLCAiVU1BUCIsIGNvbG9yX2J5ID0gImRldGVjdGVkIikgKwogICAgIyBtYWtlIHRoZSBsZWdlbmQgbGFiZWwgbW9yZSBpbmZvcm1hdGl2ZSAodGhpcyBpcyBnZ3Bsb3QyIGNvZGUhKQogICAgZ3VpZGVzKGNvbG9yID0gZ3VpZGVfY29sb3JiYXIodGl0bGU9ImdlbmVzXG5leHByZXNzZWQiKSkKfQpgYGAKCkxldCdzIG1ha2Ugc3VyZSB0aGF0IHdvcmtzIGFuZCBn
aXZlcyB0aGUgc2FtZSByZXN1bHQgYXMgYmVmb3JlIHdoZW4gd2UgdXNlIHRoZSBkZWZhdWx0IHBhcmFtZXRlcnMuCgpgYGB7ciBmdW5jdGlvbi10ZXN0fQpVTUFQX3Bsb3Rfd3JhcHBlcihubl9wYXJhbSA9IDE1KQpgYGAKCipLaW5kIG9mPyoKClRoaXMgaXNuJ3QgeW91ciBmYXVsdCEKVU1BUCBpcyBhIG5vbi1kZXRlcm1pbmlzdGljIGZ1bmN0aW9uLCB3aGljaCBtZWFucyB0aGF0IHRoZXJlIGlzIGEgcmFuZG9tIGNvbXBvbmVudCB0byB0aGUgcmVzdWx0cy4KV2UgY2FuIHVzZSBgc2V0LnNlZWQoKWAgdG8gYmUgc3VyZSB0aGF0IGFuIGluZGl2aWR1YWwgcnVuIChvciBzZXQgb2YgcnVucykgaXMgdGhlIHNhbWUgZXZlcnkgdGltZSB5b3UgcnVuIHlvdXIgYW5hbHlzaXMsIGJ1dCBpdCBpcyBpbXBvcnRhbnQgdG8gY2hlY2sgeW91ciByZXN1bHRzIGEgZmV3IHRpbWVzIHdpdGggZGlmZmVyZW50IHJhbmRvbSBzdGFydGluZyBwb2ludHMgdG8gYmUgc3VyZSB0aGF0IHRoZSByYW5kb20gY29tcG9uZW50IGlzIG5vdCBnaXZpbmcgeW91IGFub21hbG91cyByZXN1bHRzLgpTZXR0aW5nIGEgZGlmZmVyZW50IHJhbmRvbSBudW1iZXIgc2VlZCB3aXRoIGBzZXQuc2VlZCgpYCBpcyBvbmUgd2F5IHRvIGRvIHRoaXMsIG9yIHlvdSBjYW4gcnVuIHRoZSBhbmFseXNpcyBtdWx0aXBsZSB0aW1lcyBpbiB0aGUgc2FtZSBzZXNzaW9uLCBhcyB3ZSBoYXZlIGRvbmUgaGVyZS4KCkZpbGwgaW4gdGhlIG5leHQgZmV3IGNvZGUgY2h1bmtzIHdpdGggdGhlIGZ1bmN0aW9uIGFuZCB0aGUgYG5fbmVpZ2hib3JzYCBhcmd1bWVudCB5b3Ugd291bGQgbGlrZSB0byB1c2UgZm9yIGVhY2guCihGZWVsIGZyZWUgdG8gYWRkIG1vcmUgdGVzdHMhKQpUaGVuIHJ1biB0aGUgY2h1bmtzIGFuZCBjb21wYXJlIHlvdXIgb3V0cHV0IGdyYXBocy4KCmBgYHtyIHJ1bi1VTUFQLTEsIGxpdmUgPSBUUlVFfQojIFRyeSBzb21ldGhpbmcgbG93PwpVTUFQX3Bsb3Rfd3JhcHBlcihubl9wYXJhbSA9IDMpCmBgYAoKYGBge3IgcnVuLVVNQVAtMiwgbGl2ZSA9IFRSVUV9CiMgVHJ5IHNvbWV0aGluZyBoaWdoPwpVTUFQX3Bsb3Rfd3JhcHBlcihubl9wYXJhbSA9IDEwMCkKYGBgCgpgYGB7ciBydW4tVU1BUC0zLCBsaXZlID0gVFJVRX0KIyBUcnkgd2hhdGV2ZXIgeW91IGxpa2UhClVNQVBfcGxvdF93cmFwcGVyKG5uX3BhcmFtID0gNSkKYGBgCgojIyMjIFNvbWUgJ2JpZyBwaWN0dXJlJyB0aG91Z2h0cyB0byB0YWtlIGZyb20gdGhpcyBleHBlcmltZW50OgoKMS4gQW5hbHlzZXMgc3VjaCBhcyBVTUFQIGhhdmUgdmFyaW91cyBsaW1pdGF0aW9ucyBmb3IgaW50ZXJwcmV0YWJpbGl0eS4KVGhlIGNvb3JkaW5hdGVzIG9mIFVNQVAgb3V0cHV0IGZvciBhbnkgZ2l2ZW4gY2VsbCBjYW4gY2hhbmdlIGRyYW1hdGljYWxseSBkZXBlbmRpbmcgb24gcGFyYW1ldGVycywgYW5kIGV2ZW4gcnVuIHRvIHJ1biB3aXRoIHRoZSBzYW1lIHBhcmFtZXRlcnMuClRoaXMgcHJvYmFibHkgbWVhbnMgdGhhdCB5b3Ugc2hv
dWxkbid0IHJlbHkgb24gdGhlIGV4YWN0IHZhbHVlcyBvZiBVTUFQJ3Mgb3V0cHV0LgoKICAgIC0gT25lIHBhcnRpY3VsYXIgbGltaXRhdGlvbiBvZiBVTUFQIChhbmQgdC1TTkUpIGlzIHRoYXQgd2hpbGUgb2JzZXJ2ZWQgY2x1c3RlcnMgaGF2ZSBzb21lIG1lYW5pbmcsIHRoZSBkaXN0YW5jZSAqYmV0d2VlbiogY2x1c3RlcnMgdXN1YWxseSBkb2VzIG5vdCAobm9yIGRvZXMgY2x1c3RlciBkZW5zaXR5KS4KICAgIFRoZSBmYWN0IHRoYXQgdHdvIGNsdXN0ZXJzIGFyZSBuZWFyIGVhY2ggb3RoZXIgc2hvdWxkIE5PVCBiZSBpbnRlcnByZXRlZCB0byBtZWFuIHRoYXQgdGhleSBhcmUgbW9yZSByZWxhdGVkIHRvIGVhY2ggb3RoZXIgdGhhbiB0byBtb3JlIGRpc3RhbnQgY2x1c3RlcnMuCiAgICAoVGhlcmUgaXMgc29tZSBkaXNhZ3JlZW1lbnQgYWJvdXQgd2hldGhlciBVTUFQIGRpc3RhbmNlcyBoYXZlIG1vcmUgbWVhbmluZywgYnV0IGl0IGlzIHByb2JhYmx5IHNhZmVyIHRvIGFzc3VtZSB0aGV5IGRvbid0LikKCgoyLiBQbGF5aW5nIHdpdGggcGFyYW1ldGVycyBzbyB5b3UgY2FuIGZpbmUtdHVuZSB0aGVtIGlzIGEgZ29vZCB3YXkgdG8gZ2l2ZSB5b3UgbW9yZSBpbmZvcm1hdGlvbiBhYm91dCBhIHBhcnRpY3VsYXIgYW5hbHlzaXMgYXMgd2VsbCBhcyB0aGUgZGF0YSBpdHNlbGYuCgozLiBXaGVyZSByZXN1bHRzIGFyZSBjb25zaXN0ZW50LCB0aGV5IGFyZSBtb3JlIGxpa2VseSB0byBoYXZlIG1lYW5pbmcuCldoaWxlIHdlIGRvIG5vdCBoYXZlIGxhYmVsZWQgY2VsbCB0eXBlcyBpbiB0aGlzIGNhc2UsIHRoZXJlIGRvZXMgc2VlbSB0byBiZSBzb21lIGNvbnNpc3RlbmN5IG9mIHRoZSBvdmVyYWxsIHBhdHRlcm5zIHRoYXQgd2Ugc2VlIChpZiBub3QgcHJlY2lzZSB2YWx1ZXMpLCBhbmQgdGhpcyBsaWtlbHkgcmVmbGVjdHMgYmlvbG9naWNhbCBpbmZvcm1hdGlvbiAob3IgdGVjaG5pY2FsIGFydGlmYWN0cykuCgpJbiBzdW1tYXJ5LCBpZiB0aGUgcmVzdWx0cyBvZiBhbiBhbmFseXNpcyBjYW4gYmUgY29tcGxldGVseSBjaGFuZ2VkIGJ5IGNoYW5naW5nIGl0cyBwYXJhbWV0ZXJzLCB5b3Ugc2hvdWxkIGJlIG1vcmUgY2F1dGlvdXMgd2hlbiBpdCBjb21lcyB0byB0aGUgY29uY2x1c2lvbnMgeW91IGRyYXcgZnJvbSBpdCBhcyB3ZWxsIGFzIGhhdmluZyBnb29kIHJhdGlvbmFsZSBmb3IgdGhlIHBhcmFtZXRlcnMgeW91IGNob29zZS4KCiMjIyB0LVNORSBjb21wYXJpc29uCgpJbiB0aGUgYmxvY2sgYmVsb3cgaXMgYSBzaW1pbGFyIGFuYWx5c2lzIGFuZCBwbG90IHdpdGggdC1TTkUgKHQtZGlzdHJpYnV0ZWQgU3RvY2hhc3RpYyBOZWlnaGJvciBFbWJlZGRpbmcpLgpOb3RlIHRoYXQgdGhpcyBhbmFseXNpcyBhbHNvIHVzZXMgUENBIGJlZm9yZSBtb3Zpbmcgb24gdG8gdGhlIGZhbmN5IG1hY2hpbmUgbGVhcm5pbmcuCgpgYGB7ciB0c25lLCBsaXZlID0gVFJVRX0KIyBSdW4gVFNORQpub3JtYWxpemVkX3NjZSA8LSBydW5UU05FKG5vcm1hbGl6ZWRf
c2NlLCBkaW1yZWQgPSAiUENBIikKCiMgcGxvdCB3aXRoIHNjYXRlciBmdW5jdGlvbgpwbG90UmVkdWNlZERpbShub3JtYWxpemVkX3NjZSwgIlRTTkUiLCBjb2xvcl9ieSA9ICJkZXRlY3RlZCIpCmBgYAoKRGlmZmVyZW50ISAoU2xvd2VyISkgSXMgaXQgYmV0dGVyIG9yIHdvcnNlPyBIYXJkIHRvIHNheSEKRGlmZmVyZW50IHBlb3BsZSBsaWtlIGRpZmZlcmVudCB0aGluZ3MsIGFuZCBvbmUgcGxvdCBtaWdodCBpbGx1c3RyYXRlIGEgcGFydGljdWxhciBwb2ludCBiZXR0ZXIgdGhhbiBhbm90aGVyLgoKIyMgU2F2ZSByZXN1bHRzCgpXZSBhcmUgZ29pbmcgdG8gdXNlIHRoaXMgZGF0YSBtb3JlIGluIHRoZSBuZXh0IG5vdGVib29rLCBzbyBsZXQncyBzYXZlIGl0IGFzIGFuIGBSRFNgIGZpbGUuCgpgYGB7ciBzYXZlfQpyZWFkcjo6d3JpdGVfcmRzKG5vcm1hbGl6ZWRfc2NlLCBmaWxlID0gb3V0cHV0X3NjZV9maWxlKQpgYGAKCgojIyMgU29tZSBmdXJ0aGVyIHJlYWRpbmcgb24gZGltZW5zaW9uIHJlZHVjdGlvbjoKCi0gVGhpcyB3ZWJzaXRlIGV4cGxhaW5zIFtQQ0EgdmlzdWFsbHldKGh0dHA6Ly9zZXRvc2EuaW8vZXYvcHJpbmNpcGFsLWNvbXBvbmVudC1hbmFseXNpcy8pLgotIFtCZWNodCAqZXQgYWwuKiAoMjAxOCldKGh0dHBzOi8vd3d3Lm5hdHVyZS5jb20vYXJ0aWNsZXMvbmJ0LjQzMTQpIGRpc2N1c3NlcyB1c2luZyBbVU1BUF0oaHR0cHM6Ly9naXRodWIuY29tL2xtY2lubmVzL3VtYXApIGZvciBzaW5nbGUtY2VsbCBkYXRhLgotIFtXYXR0ZW5iZXJnICpldCBhbC4qICgyMDE2KV0oaHR0cHM6Ly9kaXN0aWxsLnB1Yi8yMDE2L21pc3JlYWQtdHNuZS8pIGRpc2N1c3MgaG93IHRvIHVzZSB0LVNORSBwcm9wZXJseSB3aXRoIGdyZWF0IHZpc3VhbHMuCihUaGUgbGVzc29ucyBhcHBseSB0byBVTUFQIGFzIHdlbGwsIHdpdGggYSBicm9hZCBzdWJzdGl0dXRpb24gb2YgdGhlIGBuX25laWdoYm9yc2AgcGFyYW1ldGVyIGZvciBgcGVycGxleGl0eWAuKQotIFtOZ3V5ZW4gJiBIb2xtZXMgKDIwMTkpXShodHRwczovL2pvdXJuYWxzLnBsb3Mub3JnL3Bsb3Njb21wYmlvbC9hcnRpY2xlP2lkPTEwLjEzNzEvam91cm5hbC5wY2JpLjEwMDY5MDcpIGxheSBvdXQgZ3VpZGVsaW5lcyBvbiBjaG9vc2luZyBkaW1lbnNpb25zIHJlZHVjdGlvbiBtZXRob2RzLgotIFtGcmVpdGFnICgyMDE5KV0oaHR0cHM6Ly9ycHVicy5jb20vU2Fza2lhLzUyMDIxNikgaXMgYSBuaWNlIGV4cGxhbmF0aW9uIGFuZCBjb21wYXJpc29uIG9mIG1hbnkgZGlmZmVyZW50IGRpbWVuc2lvbmFsaXR5IHJlZHVjdGlvbiB0ZWNobmlxdWVzIHRoYXQgeW91IG1heSBlbmNvdW50ZXIuCgoKIyMgU2Vzc2lvbiBJbmZvCgpgYGB7ciBzZXNzaW9ufQpzZXNzaW9uSW5mbygpCmBgYAo=
+ + +
+
+ +
+ + + + + + + + + + + + + + + + + diff --git a/completed-notebooks/scRNA-seq/05-clustering_markers_scRNA.nb.html b/completed-notebooks/scRNA-seq/05-clustering_markers_scRNA.nb.html new file mode 100644 index 0000000..c4f90cf --- /dev/null +++ b/completed-notebooks/scRNA-seq/05-clustering_markers_scRNA.nb.html @@ -0,0 +1,3873 @@ + + + + + + + + + + + + + + + +Clustering cells and finding marker genes from scRNA-seq data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + +
+

Objectives

+

This notebook will demonstrate how to:

+
    +
  • Identify clusters of cells in single-cell data
  • Compare results from different clustering methods
  • Select putative marker genes that can be used to differentiate +clusters
+
+
+
+

Set Up

+
+

Load libraries

+ + + +
# Load libraries
+library(ggplot2)
+library(scater)
+ + +
Loading required package: SingleCellExperiment
+ + +
Loading required package: SummarizedExperiment
+ + +
Loading required package: MatrixGenerics
+ + +
Loading required package: matrixStats
+ + +

+Attaching package: 'MatrixGenerics'
+ + +
The following objects are masked from 'package:matrixStats':
+
+    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
+    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
+    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
+    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
+    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
+    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
+    colWeightedMeans, colWeightedMedians, colWeightedSds,
+    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
+    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
+    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
+    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
+    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
+    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
+    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
+    rowWeightedSds, rowWeightedVars
+ + +
Loading required package: GenomicRanges
+ + +
Loading required package: stats4
+ + +
Loading required package: BiocGenerics
+ + +

+Attaching package: 'BiocGenerics'
+ + +
The following objects are masked from 'package:stats':
+
+    IQR, mad, sd, var, xtabs
+ + +
The following objects are masked from 'package:base':
+
+    anyDuplicated, aperm, append, as.data.frame, basename, cbind,
+    colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
+    get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
+    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
+    Position, rank, rbind, Reduce, rownames, sapply, setdiff, table,
+    tapply, union, unique, unsplit, which.max, which.min
+ + +
Loading required package: S4Vectors
+ + +

+Attaching package: 'S4Vectors'
+ + +
The following object is masked from 'package:utils':
+
+    findMatches
+ + +
The following objects are masked from 'package:base':
+
+    expand.grid, I, unname
+ + +
Loading required package: IRanges
+ + +
Loading required package: GenomeInfoDb
+ + +
Loading required package: Biobase
+ + +
Welcome to Bioconductor
+
+    Vignettes contain introductory material; view with
+    'browseVignettes()'. To cite Bioconductor, see
+    'citation("Biobase")', and for packages 'citation("pkgname")'.
+ + +

+Attaching package: 'Biobase'
+ + +
The following object is masked from 'package:MatrixGenerics':
+
+    rowMedians
+ + +
The following objects are masked from 'package:matrixStats':
+
+    anyMissing, rowMedians
+ + +
Warning: replacing previous import 'S4Arrays::makeNindexFromArrayViewport' by
+'DelayedArray::makeNindexFromArrayViewport' when loading 'SummarizedExperiment'
+ + +
Loading required package: scuttle
+ + +
library(scran)
+
+# clustering tools
+library(bluster)
+
+# Setting the seed for reproducibility
+set.seed(12345)
+ + + + + + +
# main data directory
+data_dir <- file.path("data", "hodgkins")
+
+# normalized data file
+normalized_rds <- file.path(data_dir, "normalized", "normalized_hodgkins_sce.rds")
+
+# Output directory for markers
+marker_dir <- file.path("analysis", "hodgkins", "markers")
+fs::dir_create(marker_dir)
+ + + + + + +
hodgkins_sce <- readr::read_rds(normalized_rds)
+ + + +
+
+
+

Assigning cell clusters

+
+Roadmap: Cluster +
+
+

When we performed dimensionality reduction on our single-cell data, +we could see visually that the cells tended to cluster together into +groups. To the extent that such clustering is a real biological +phenomenon, representing cells with similar patterns of gene expression, +we might like to identify distinct groups that we can name and assign a +label to. Ultimately, we would hope that these labels correspond to +previously identified (or newly identified!) cell types, and that we can +use that information to provide more insight into the results of our +experiment.

+

There are a number of methods to identify clusters in +multidimensional data like our single-cell data and to assign cells to +those clusters. We +will explore a couple of the more common methods here.

+
+

k-means clustering

+

The first method we will try is k-means clustering. The +k here refers to the number of clusters that we will +create, and must be chosen before we start. This clustering method seeks +to find a way to divide the cells into k clusters such that +the cells within each cluster are as similar as possible and the +differences among clusters are as large as possible.

+

It turns out that is a pretty hard problem to solve exactly, but we +can do pretty well with an algorithm that starts with a random guess at +where the clusters will be:

+
    +
  1. We start by picking random center locations (how we do this can +vary).
  2. Then, we assign cells to clusters by finding which center is closest +to each cell.
  3. Next, we find the centers of these new clusters.
  4. Go back to step 2 with these new centers, repeating until the +cluster assignments stop changing.
+
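The steps above can be sketched in a few lines of base R. This is only a toy illustration, not the implementation used by `bluster` or `stats::kmeans()`: the function name `simple_kmeans()` and the simulated `toy` matrix are made up for this example, and no care is taken to choose good starting centers.

```r
# Toy version of the k-means steps; for illustration only
simple_kmeans <- function(mat, k, max_iter = 100) {
  # step 1: pick k random rows as the starting centers
  centers <- mat[sample(nrow(mat), k), , drop = FALSE]
  assignments <- rep(0L, nrow(mat))
  for (i in seq_len(max_iter)) {
    # step 2: squared distance from every row to every center,
    # then assign each row to its closest center
    dists <- sapply(seq_len(k), \(j) rowSums(sweep(mat, 2, centers[j, ])^2))
    new_assignments <- max.col(-dists, ties.method = "first")
    # step 4: stop once the assignments no longer change
    if (all(new_assignments == assignments)) break
    assignments <- new_assignments
    # step 3: move each center to the mean of the rows assigned to it
    for (j in seq_len(k)) {
      members <- mat[assignments == j, , drop = FALSE]
      if (nrow(members) > 0) centers[j, ] <- colMeans(members)
    }
  }
  assignments
}

# simulated data: two groups of 20 points in two dimensions
set.seed(12345)
toy <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(40, mean = 10, sd = 0.5), ncol = 2)
)
toy_clusters <- simple_kmeans(toy, k = 2)
table(toy_clusters)
```

Because the starting centers are random, running this with a different seed can give different label numbers, and on harder data it can give different clusters as well.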

You might wonder: How many clusters should we use? That is a hard +question! There are some heuristics we can use for deciding the +“correct” number of clusters, but we will not be exploring those right +now.

+

For an intuitive visualization of the general k-means method, you +might find this +StatQuest video useful, and for more discussion of the method in a +single-cell context, the Orchestrating +Single-Cell Analysis book section on k-means is a good +reference.

+

We are going to use the function clusterRows() from the +Bioconductor bluster package for our clustering. This +function takes a matrix where each sample (cell in our case) is a row +and each column is a feature. The matrix of counts (or normalized +counts) by cells in our SingleCellExperiment object is the +wrong orientation, so at a minimum we would have to transpose that +matrix before proceeding.

+
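As a small illustration of the orientation issue, here is a made-up genes-by-cells matrix (the values and the `toy_counts` name are invented for this sketch) and its transpose, which has the cells-as-rows layout that `clusterRows()` expects.

```r
# toy expression matrix: 3 genes (rows) x 4 cells (columns)
toy_counts <- matrix(
  c(5, 0, 2, 1,
    0, 3, 0, 4,
    7, 1, 6, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# transpose so that each cell is a row and each gene is a column
cells_by_genes <- t(toy_counts)
dim(cells_by_genes)
```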

However, clustering algorithms like k-means can be a bit slow with as +many features as the number of genes that we have in our data set, so we +would rather not use the raw data. There is also a potential concern +that noise in the raw data might disrupt the clustering algorithm, so it +would be best to use some kind of dimensionality reduction algorithm +first. We still want to maintain a good number of dimensions, so our old +friend PCA is a good (and very standard) choice.

+

Thankfully, we already computed and stored a matrix with +reduced dimensions with the runPCA() function. We will +extract that from the SingleCellExperiment object with the +reducedDim() function, which conveniently returns a matrix +with the cells as rows, so we can use that directly!

+

The other argument we need for clusterRows() will tell +it which clustering algorithm to use, and any additional parameters +associated with that algorithm. This has to be made with a special kind +of function from the bluster package. In this case, we will +use KmeansParam() to specify that we want k-means +clustering, with the centers parameter to set how many +clusters we will assign (k).

+ + + +
# set the number of clusters
+k <- 7
+
+# extract the principal components matrix
+hodgkins_pca <- reducedDim(hodgkins_sce, "PCA")
+
+# perform the clustering
+kclusters <- clusterRows(hodgkins_pca, KmeansParam(centers = k))
+ + + +

The clusterRows() function returned a vector of cluster +assignments as integers, but the numerical values have no inherent +meaning. For plotting we will want to convert those to a factor, so R is +not tempted to treat them as a continuous variable.

+
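To see why the conversion matters, consider a made-up vector of cluster assignments (`toy_assignments` is not part of our data). As integers, R will happily do arithmetic on the labels; as a factor, they are treated as categories.

```r
# hypothetical cluster assignments for five cells
toy_assignments <- c(2L, 1L, 3L, 2L, 1L)

# R will compute a mean, even though it has no meaning for labels
mean(toy_assignments)

# as a factor, the values are categories with discrete levels
toy_factor <- factor(toy_assignments)
levels(toy_factor)
table(toy_factor)
```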

We can also store them back into the column (cell) information table +of the original object for convenient storage and later use.

+ + + +
# save clusters in the SCE object as a factor
+hodgkins_sce$kcluster <- factor(kclusters)
+ + + +

Now we can plot the results and see how the clustering looks, using +the scater function plotReducedDim() that we +have used before, coloring the points by our clustering results. We will +start by using the UMAP coordinates for the plot. Note that this does +require that the cluster identities were stored in the +SingleCellExperiment object, as we just did.

+ + + +
# plot clustering results
+plotReducedDim(hodgkins_sce, "UMAP", color_by = "kcluster")
+ + +

+ + + +
  • Do those clusters line up with what you might have expected if you were doing this by eye?
  • If we repeat this, do we get the same cluster assignments?
  • What happens if we change the number of clusters?
  • What do the results look like if you plot with the PCA or TSNE coordinates?
+

You will have time to explore questions like these in the exercise +notebooks. One thing worth noting right away though is that cluster +numbers here and in the future are assigned arbitrarily. Even if we got +exactly the same logical clusters across runs (unlikely!), we wouldn’t +expect the label numbers to be the same or stable.

+
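One simple way to compare two sets of cluster labels is to cross-tabulate them. In this made-up example (the `run1` and `run2` vectors are invented), the two runs group the cells identically but number the groups differently, so each row and each column of the table has exactly one non-zero entry.

```r
# hypothetical cluster labels for the same seven cells from two runs
run1 <- c(1, 1, 2, 2, 3, 3, 3)
run2 <- c(3, 3, 1, 1, 2, 2, 2) # same groups, different label numbers

# cross-tabulate to see which labels correspond to each other
label_table <- table(run1, run2)
label_table
```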
+
+

Graph-based clustering

+

Another common type of clustering method for single cell data is graph-based clustering. This approach follows these general steps:

+
  1. Identify a set of nearest neighbors for each cell that have similar expression profiles to that cell.
  2. Connect each cell to its neighbors in a network graph, weighting the connections by how similar the connected cells are.
  3. Break the network up by identifying clusters of cells that are more connected to each other than they are to cells outside the clusters.
+

There is a lot of hidden detail in those three steps!

+
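To make the first two steps a bit more concrete, here is a base R sketch that finds nearest neighbors and records them in an unweighted adjacency matrix. Everything here is an assumption for illustration: the `toy_pca` matrix is simulated, the connections are not weighted, and the cluster-finding step is left out entirely. Real implementations such as the one in `bluster` differ in many details.

```r
set.seed(12345)
# simulated "PCA" matrix: 6 cells (rows) x 2 dimensions, in two groups
toy_pca <- rbind(
  matrix(rnorm(6, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(6, mean = 5, sd = 0.3), ncol = 2)
)
rownames(toy_pca) <- paste0("cell", 1:6)
n_neighbors <- 2

# step 1: pairwise distances, then the nearest neighbors of each cell
dist_mat <- as.matrix(dist(toy_pca))
diag(dist_mat) <- Inf # a cell is not its own neighbor
# result has one column per cell, holding the indices of its neighbors
neighbors <- apply(dist_mat, 1, \(d) order(d)[seq_len(n_neighbors)])

# step 2: record the connections in an (unweighted) adjacency matrix
adjacency <- matrix(0, nrow = 6, ncol = 6, dimnames = dimnames(dist_mat))
for (cell in seq_len(6)) {
  adjacency[cell, neighbors[, cell]] <- 1
}
adjacency
```

In this toy case, cells only connect to other cells in their own group, which is the kind of structure the third step tries to detect.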

To apply this clustering algorithm, we will use the same +bluster::clusterRows() function as before, but we will +change the second argument from KmeansParam() to +NNGraphParam() to tell it that we want to use a +nearest-neighbor graph-based method. We can then supply additional +parameters to NNGraphParam() to adjust the details of the +algorithm. Here we will use k to specify the number of +neighbors to use when building the graph and cluster.fun to +specify the algorithm for identifying the clusters within the graph.

+
  • Despite sharing a letter, k here and the one from k-means clustering are not the same thing! In this case, we are telling the algorithm how many neighbor connections to make for each cell, not the final number of clusters, which will be determined by the algorithm we use for the cluster building step.
  • The options for cluster.fun describe the algorithm for the cluster building step described above. These include walktrap (the default), leiden, and louvain, which is the default algorithm in Seurat, another common package for single cell analysis that you may have seen.
+

In the example below, we will use the default values for these two +arguments.

+ + + +
# run the clustering algorithm
+nnclusters <- clusterRows(
+  hodgkins_pca,
+  NNGraphParam(k = 10,
+               cluster.fun = "walktrap")
+  )
+# store cluster results in the SCE object
+hodgkins_sce$nncluster <- factor(nnclusters)
+ + + +

Now we can plot the results of our graph-based clustering. This time +we will also use the text_by argument to include the +cluster ids directly on the plot.

+ + + +
plotReducedDim(hodgkins_sce,
+               "UMAP",
+               color_by = "nncluster",
+               text_by = "nncluster")
+ + +

+ + + +
  • How do these results compare to the k-means clustering result?
  • How sensitive is this to the parameters we choose?
  • How do the numbers of clusters change with different parameters?
+

Again, you will have time to explore these more in the exercise +notebook, and of course with your own data! Sadly, there are not always +good answers to which set of inferred clusters is best! Which method and +parameters you use may depend on the kind of question you are trying to +answer.

+

For more detailed information about the methods presented here, including some ways to assess the “quality” of the clustering, we encourage you to explore the relevant chapter of the Orchestrating Single-Cell Analysis book. A recent review by Kiselev et al. (2019) also goes into some depth about the differences among algorithms and the general challenges associated with clustering single cell data.

+
+
+
+

Identifying marker genes

+
Roadmap: Find markers
+
+

Assigning clusters is nice for visualization, but we would also like to be able to move toward a biological interpretation of the clusters and identify the cell types in each cluster. To that end, we can identify marker genes that are differentially expressed among clusters.

+

It is worth noting that the statistical calculations here are more than a bit circular: we identified clusters first based on gene expression, then we are using those same clusters to find differences in gene expression. The result is that even if there were no true clusters, we would always find marker genes! For a much more technical exploration of this circularity (and a method to correct for it), see a preprint by Gao et al. (2020). In light of this, it is better to think about marker gene identification as an aid in interpreting the clustering results (and possibly extending insights to new data sets), rather than results that should be interpreted on their own, and we should be extremely wary of justifying cluster assignments solely based on these results! With that caveat, let’s proceed.

+

To identify marker genes, we will use the scran::findMarkers() function, which will rank genes by their differential expression by calculating pairwise statistics among clusters. We have a few options for how to determine the gene rankings and marker gene list for each cluster. At one end, we could include genes that are differentially expressed in any pairwise comparison against our focal cluster, or at the other we could only include genes that are differentially expressed in all comparisons with that cluster. We could also do something in between, including genes that differentiate the focal cluster from some fraction of the other clusters. For now, we will use the findMarkers() function to rank the genes in each cluster by their combined scores against all other clusters, using the pval.type argument.

+
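To get a feel for how the "any" and "all" options can rank genes differently, here is a base R sketch using invented p-values. It assumes the combining rules described in the `scran` documentation (the maximum p-value for "all", and Simes' method for "any"); the `pairwise_p` matrix and the `simes()` helper are hypothetical and not part of `scran`.

```r
# hypothetical p-values for two genes, comparing a focal cluster
# against each of three other clusters
pairwise_p <- rbind(
  geneA = c(0.001, 0.8, 0.5), # strong in only one comparison
  geneB = c(0.01, 0.02, 0.03) # consistent across all comparisons
)

# "all": a gene is only as good as its weakest comparison
p_all <- apply(pairwise_p, 1, max)

# "any": Simes' method, which rewards a strong result in any comparison
simes <- function(p) min(length(p) * sort(p) / seq_along(p))
p_any <- apply(pairwise_p, 1, simes)

rbind(p_all, p_any)
```

With these values, `geneB` ranks first under "all" while `geneA` ranks first under "any", which is why the choice of `pval.type` changes the marker lists.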

findMarkers() will return a list (technically a list-like object) of tables, one for each cluster, with statistics for each gene showing how well it differentiates that cluster from the other clusters.

+ + + +
# use `findMarkers()` to calculate how well each gene
+#  differentiates each cluster from *all* other clusters
+markers <- scran::findMarkers(hodgkins_sce,
+                              groups = hodgkins_sce$nncluster,
+                              pval.type = "all")
+ + + +

Next we can look at one of those tables. We will start with the first cluster, which we will select from the list using the R standard double bracket [[1]] notation. We are also doing a bit of transformation here to pull the gene name into a column of its own.

+ + + +
markers[[1]] |>
+  as.data.frame() |> # convert to a data frame
+  tibble::rownames_to_column("gene") # make gene a column
+ +
+ +
+ + +

You can see that this table includes values for all genes, so we +would like to make a shorter list.

+

Because we tend to like tidy data, here we use +a tidyverse function from the purrr package to +apply the same operations as above to every element of the +markers list. We will introduce purrr briefly +here, but if you want more information and background, we recommend the +purrr +cheatsheet (PDF) and Jenny Bryan’s great purrr +tutorial.

+

The main functions in purrr are the map() +functions, which take as their main arguments a list +and a function to apply to each element of the list. +The main function is purrr::map(); if you are familiar with +the base R lapply() function, it is very similar, but with +some different defaults. We will use it to get the top rows from each +table by applying the head() function to each element of +the list. The results are returned as a new list.

+ + + +
purrr::map(
+  as.list(markers[1:3]), # select the first 3 clusters and convert to a 'regular' list for purrr
+  head # the function to apply (note: no parentheses)
+  )
+ + + +

This returns a list of data frames, which isn’t quite what we +want.

+

There is no built-in function that will give us just the first few +row names, so we will have to define one. As of version 4.1, R +introduced a new approach to defining anonymous functions - +that is, functions you can quickly define “on-the-fly” without formally +assigning them to a function name. They are handy when you need to do a +very short task that requires a function, but it isn’t really a function +you need beyond this context. This new anonymous syntax looks like this: +\(x)... (or for slightly longer code, use curly braces as +in \(x) {...}). This defines a function that takes one +argument, x, with ... indicating where you +would put the expression to calculate.

+
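As a quick illustration of the syntax, the sketch below applies a named function and an equivalent anonymous function to a small made-up list (`toy_list` and `add_one()` are invented for this example), using base R's `lapply()` and `sapply()` so that it runs without any extra packages.

```r
# a named function and a small list to apply it to
add_one <- function(x) x + 1
toy_list <- list(a = 1:3, b = 4:6)

# these two calls give the same result
named_result <- lapply(toy_list, add_one)
anon_result <- lapply(toy_list, \(x) x + 1)

# curly braces allow a longer function body
range_width <- sapply(toy_list, \(x) {
  limits <- range(x)
  limits[2] - limits[1]
})
range_width
```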

purrr::map() will then apply the expression in our +anonymous function to each element of the list, and return the results +as a new list.

+ + + +
# Get the first few row names of each table with a purrr function.
+purrr::map(
+  # convert markers to a 'regular' list for purrr
+  as.list(markers),
+  # our custom function:
+  \(x) head( rownames(x) )
+)
+ + +
$`1`
+[1] "ENSG00000153064" "ENSG00000247982" "ENSG00000042980" "ENSG00000224137"
+[5] "ENSG00000211898" "ENSG00000163534"
+
+$`2`
+[1] "ENSG00000213809" "ENSG00000172543" "ENSG00000153563" "ENSG00000104660"
+[5] "ENSG00000164120" "ENSG00000182871"
+
+$`3`
+[1] "ENSG00000124882" "ENSG00000137462" "ENSG00000136689" "ENSG00000103569"
+[5] "ENSG00000123689" "ENSG00000043462"
+
+$`4`
+[1] "ENSG00000120875" "ENSG00000115935" "ENSG00000100629" "ENSG00000136754"
+[5] "ENSG00000082074" "ENSG00000163599"
+
+$`5`
+[1] "ENSG00000229117" "ENSG00000231500" "ENSG00000147403" "ENSG00000109475"
+[5] "ENSG00000105372" "ENSG00000156508"
+
+$`6`
+[1] "ENSG00000266088" "ENSG00000235576" "ENSG00000204866" "ENSG00000156234"
+[5] "ENSG00000174946" "ENSG00000111796"
+
+$`7`
+[1] "ENSG00000111678" "ENSG00000275385" "ENSG00000173369" "ENSG00000164754"
+[5] "ENSG00000159189" "ENSG00000197249"
+
+$`8`
+[1] "ENSG00000164236" "ENSG00000054219" "ENSG00000128487" "ENSG00000086758"
+[5] "ENSG00000181163" "ENSG00000133112"
+
+$`9`
+[1] "ENSG00000156508" "ENSG00000205542" "ENSG00000171863" "ENSG00000181163"
+[5] "ENSG00000186468" "ENSG00000145592"
+
+$`10`
+[1] "ENSG00000008517" "ENSG00000128340" "ENSG00000026025" "ENSG00000092820"
+[5] "ENSG00000102879" "ENSG00000054267"
+
+$`11`
+[1] "ENSG00000141753" "ENSG00000085063" "ENSG00000148175" "ENSG00000115306"
+[5] "ENSG00000182871" "ENSG00000085733"
+
+$`12`
+[1] "ENSG00000132465" "ENSG00000185507" "ENSG00000156675" "ENSG00000051108"
+[5] "ENSG00000135916" "ENSG00000101057"
+ + +
+ + + +

Another variant is purrr::imap(), which allows us to use the names of the list elements in our function. (Try names(markers) to see the names for the list we are working with now.) We will use that here to name the output files where we will write each of the marker tables, one for each cluster. We are again defining a custom function within the call to purrr::imap() using the \(x) syntax, but this time we need two variables: we will use table for the list elements (each a table of results) and id for their names. So, we will start by defining the function as \(table, id), since there will be two input arguments. Because we don’t know the identities of the clusters we identified, the names are just the cluster numbers for now.

+
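Here is a minimal sketch of how `purrr::imap()` passes both the element and its name to the function, using a made-up list of small data frames (`toy_tables` is invented for this example and stands in for the marker tables).

```r
# a small named list of data frames, standing in for the marker tables
toy_tables <- list(
  "1" = data.frame(gene = c("g1", "g2")),
  "2" = data.frame(gene = c("g3", "g4", "g5"))
)

# the first argument is the list element, the second is its name
toy_summaries <- purrr::imap(
  toy_tables,
  \(table, id) paste0("cluster ", id, ": ", nrow(table), " genes")
)
toy_summaries
```

The result is a list with the same names as the input, which is what lets us keep track of which output belongs to which cluster.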

Making file names from numbers can be a bit fraught, as we really want them to sort in numerical order, but many systems will sort by alphabetical order. Unfortunately, that would tend to sort 10-19 before 2, 20-29 before 3, etc. To solve this, we are using the sprintf() function, which allows us to specify the format of a printed string. In this case, we are using the formatting syntax of %02d to tell it that we will want to insert (%) a number (d), with two digits and leading zeros. To see what this does a bit more concretely, let’s look at a simple example:

+ + + +
sprintf("%02d", 1:10)
+ + +
 [1] "01" "02" "03" "04" "05" "06" "07" "08" "09" "10"
+ + +
+ + + +
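The sorting problem itself is easy to reproduce. In this small sketch (the `cluster_ids` vector is made up), sorting the numbers as plain text puts "10" right after "1", while the zero-padded versions sort in numerical order.

```r
cluster_ids <- 1:12

# sorted as plain text: "10", "11", and "12" come before "2"
unpadded_sorted <- sort(as.character(cluster_ids))
unpadded_sorted

# sorted with leading zeros: text order matches numerical order
padded_sorted <- sort(sprintf("%02d", cluster_ids))
padded_sorted
```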

In addition to writing the tables out, we are saving the data frames +we created as a new list that we can use in the next step.

+ + + +
marker_df_list <- purrr::imap(
+  as.list(markers), # convert markers to a 'regular' list for purrr
+  # purrr function: `table` is the list element, `id` is the element name (number here)
+  \(table, id) {
+    as.data.frame(table) |> # first convert to a data frame
+      tibble::rownames_to_column("gene") |> # make genes a column
+      dplyr::arrange(FDR) |> # sort to be sure small FDR genes are first
+      readr::write_tsv( # write each data frame to a file
+        file.path(
+          marker_dir, # construct the output path
+          sprintf("cluster%02d_markers.tsv", as.integer(id)) # format cluster numbers in file names with leading zeros
+        )
+      )
+  }
+)
+ + + +
+

Plotting marker gene expression

+

One thing we can do with this list of marker genes is to see how they look across the cells and clusters. The scater::plotReducedDim() function makes this easy! Earlier, we colored points by a cell statistic, like the number of expressed genes, but it is just as easy to color by the expression of a single gene by using the gene identifier as the color_by argument.

+

The first step is to get the gene information for the genes we might +be interested in.

+ + + +
# get gene ids for top 10 cluster 1 markers
+gene_ids <- marker_df_list[[1]] |>
+  head(n = 10) |>
+  dplyr::pull(gene)
+
+# look at the gene info for these
+gene_info <- rowData(hodgkins_sce)[gene_ids, ]
+data.frame(gene_info)
+ +
+ +
+ + +

Now we can pick one of the genes for plotting and go!

+ + + +
# get gene id and gene symbol for nicer plotting
+rank <- 1
+gene_id <- gene_info$ID[rank]
+symbol <- gene_info$Symbol[rank]
+
+# Plot UMAP results colored by expression
+plotReducedDim(hodgkins_sce, "UMAP",
+               color_by = gene_id) +
+  # label the guide with the gene symbol
+  guides(color = guide_colorbar(title = symbol))
+ + +

+ + + +

Hopefully that expression pattern aligns at least in part with your +expectations!

+
+
+
+

Next steps

+

So far we have identified clusters of cells (if you believe them), +and found some genes that are associated with each cluster. What you +might want to know at this point is what cell types comprise +each cluster. Setting aside the thorny question of “what is a cell +type?”, this is still a challenging problem, and we’ll explore some +approaches to perform cell type annotation in the next notebook!

+
+
+

Session Info

+ + + +
sessionInfo()
+ + +
R version 4.4.0 (2024-04-24)
+Platform: x86_64-pc-linux-gnu
+Running under: Ubuntu 22.04.4 LTS
+
+Matrix products: default
+BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+
+locale:
+ [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
+ [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
+ [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
+ [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
+ [9] LC_ADDRESS=C               LC_TELEPHONE=C            
+[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
+
+time zone: Etc/UTC
+tzcode source: system (glibc)
+
+attached base packages:
+[1] stats4    stats     graphics  grDevices utils     datasets  methods  
+[8] base     
+
+other attached packages:
+ [1] bluster_1.14.0              scran_1.32.0               
+ [3] scater_1.32.0               scuttle_1.14.0             
+ [5] SingleCellExperiment_1.26.0 SummarizedExperiment_1.34.0
+ [7] Biobase_2.64.0              GenomicRanges_1.56.0       
+ [9] GenomeInfoDb_1.40.0         IRanges_2.38.0             
+[11] S4Vectors_0.42.0            BiocGenerics_0.50.0        
+[13] MatrixGenerics_1.16.0       matrixStats_1.3.0          
+[15] ggplot2_3.5.1               optparse_1.7.5             
+
+loaded via a namespace (and not attached):
+ [1] gridExtra_2.3             rlang_1.1.3              
+ [3] magrittr_2.0.3            compiler_4.4.0           
+ [5] DelayedMatrixStats_1.26.0 vctrs_0.6.5              
+ [7] stringr_1.5.1             pkgconfig_2.0.3          
+ [9] crayon_1.5.2              fastmap_1.1.1            
+[11] XVector_0.44.0            labeling_0.4.3           
+[13] utf8_1.2.4                rmarkdown_2.26           
+[15] tzdb_0.4.0                UCSC.utils_1.0.0         
+[17] ggbeeswarm_0.7.2          bit_4.0.5                
+[19] purrr_1.0.2               xfun_0.43                
+[21] zlibbioc_1.50.0           cachem_1.0.8             
+[23] beachmat_2.20.0           jsonlite_1.8.8           
+[25] highr_0.10                DelayedArray_0.30.0      
+[27] BiocParallel_1.38.0       irlba_2.3.5.1            
+[29] parallel_4.4.0            cluster_2.1.6            
+[31] R6_2.5.1                  bslib_0.7.0              
+[33] stringi_1.8.3             limma_3.60.0             
+[35] jquerylib_0.1.4           Rcpp_1.0.12              
+[37] knitr_1.46                readr_2.1.5              
+[39] Matrix_1.7-0              igraph_2.0.3             
+[41] tidyselect_1.2.1          abind_1.4-5              
+[43] yaml_2.3.8                viridis_0.6.5            
+[45] codetools_0.2-20          lattice_0.22-6           
+[47] tibble_3.2.1              withr_3.0.0              
+[49] evaluate_0.23             getopt_1.20.4            
+[51] pillar_1.9.0              generics_0.1.3           
+[53] vroom_1.6.5               hms_1.1.3                
+[55] sparseMatrixStats_1.16.0  munsell_0.5.1            
+[57] scales_1.3.0              glue_1.7.0               
+[59] metapod_1.12.0            tools_4.4.0              
+[61] BiocNeighbors_1.22.0      ScaledMatrix_1.12.0      
+[63] locfit_1.5-9.9            fs_1.6.4                 
+[65] cowplot_1.1.3             grid_4.4.0               
+[67] edgeR_4.2.0               colorspace_2.1-0         
+[69] GenomeInfoDbData_1.2.12   beeswarm_0.4.0           
+[71] BiocSingular_1.20.0       vipor_0.4.7              
+[73] cli_3.6.2                 rsvd_1.0.5               
+[75] fansi_1.0.6               S4Arrays_1.4.0           
+[77] viridisLite_0.4.2         dplyr_1.1.4              
+[79] gtable_0.3.5              sass_0.4.9               
+[81] digest_0.6.35             SparseArray_1.4.0        
+[83] ggrepel_0.9.5             dqrng_0.3.2              
+[85] farver_2.1.1              htmltools_0.5.8.1        
+[87] lifecycle_1.0.4           httr_1.4.7               
+[89] statmod_1.5.0             bit64_4.0.5              
+ + +
+ +
ZHMgZGlyZWN0bHkgb24gdGhlIHBsb3QuCgpgYGB7ciBwbG90X25uY2x1c3QsIGxpdmUgPSBUUlVFfQpwbG90UmVkdWNlZERpbShob2Rna2luc19zY2UsCiAgICAgICAgICAgICAgICJVTUFQIiwKICAgICAgICAgICAgICAgY29sb3JfYnkgPSAibm5jbHVzdGVyIiwKICAgICAgICAgICAgICAgdGV4dF9ieSA9ICJubmNsdXN0ZXIiKQpgYGAKCi0gSG93IGRvIHRoZXNlIHJlc3VsdHMgY29tcGFyZSB0byB0aGUgay1tZWFucyBjbHVzdGVyaW5nIHJlc3VsdD8KLSBIb3cgc2Vuc2l0aXZlIGlzIHRoaXMgdG8gdGhlIHBhcmFtZXRlcnMgd2UgY2hvb3NlPwotIEhvdyBkbyB0aGUgbnVtYmVycyBvZiBjbHVzdGVycyBjaGFuZ2Ugd2l0aCBkaWZmZXJlbnQgcGFyYW1ldGVycz8KCkFnYWluLCB5b3Ugd2lsbCBoYXZlIHRpbWUgdG8gZXhwbG9yZSB0aGVzZSBtb3JlIGluIHRoZSBleGVyY2lzZSBub3RlYm9vaywgYW5kIG9mIGNvdXJzZSB3aXRoIHlvdXIgb3duIGRhdGEhClNhZGx5LCB0aGVyZSBhcmUgbm90IGFsd2F5cyBnb29kIGFuc3dlcnMgdG8gd2hpY2ggc2V0IG9mIGluZmVycmVkIGNsdXN0ZXJzIGlzIGJlc3QhCldoaWNoIG1ldGhvZCBhbmQgcGFyYW1ldGVycyB5b3UgdXNlIG1heSBkZXBlbmQgb24gdGhlIGtpbmQgb2YgcXVlc3Rpb24geW91IGFyZSB0cnlpbmcgdG8gYW5zd2VyLgoKRm9yIG1vcmUgZGV0YWlsZWQgaW5mb3JtYXRpb24gYWJvdXQgdGhlIG1ldGhvZHMgcHJlc2VudGVkIGhlcmUsIGluY2x1ZGluZyBzb21lIHdheXMgdG8gYXNzZXNzIHRoZSAicXVhbGl0eSIgb2YgdGhlIGNsdXN0ZXJpbmcsIEkgZW5jb3VyYWdlIHlvdSB0byBleHBsb3JlIGF0IHRoZSByZWxldmFudCBjaGFwdGVyIG9mIHRoZSBbT3JjaGVzdHJhdGluZyBTaW5nbGUtQ2VsbCBBbmFseXNpcyBib29rXShodHRwczovL2Jpb2NvbmR1Y3Rvci5vcmcvYm9va3MvMy4xNi9PU0NBLmJhc2ljL2NsdXN0ZXJpbmcuaHRtbCNjbHVzdGVyaW5nLWdyYXBoKS4KQSByZWNlbnQgcmV2aWV3IGJ5IFtLaXNsZXYgKmV0IGFsLiogKDIwMTkpXShodHRwczovL2RvaS5vcmcvMTAuMTAzOC9zNDE1NzYtMDE4LTAwODgtOSkgYWxzbyBnb2VzIGludG8gc29tZSBkZXB0aCBhYm91dCB0aGUgZGlmZmVyZW5jZXMgYW1vbmcgYWxnb3JpdGhtcyBhbmQgdGhlIGdlbmVyYWwgY2hhbGxlbmdlcyBhc3NvY2lhdGVkIHdpdGggY2x1c3RlcmluZyBzaW5nbGUgY2VsbCBkYXRhLgoKIyMgSWRlbnRpZnlpbmcgbWFya2VyIGdlbmVzCgohW1JvYWRtYXA6IEZpbmQgbWFya2Vyc10oZGlhZ3JhbXMvcm9hZG1hcF9zaW5nbGVfZmluZG1hcmtlcnMucG5nKQoKQXNzaWduaW5nIGNsdXN0ZXJzIGlzIG5pY2UgZm9yIHZpc3VhbGl6YXRpb24sIGJ1dCB3ZSB3b3VsZCBhbHNvIGxpa2UgdG8gYmUgYWJsZSB0byBtb3ZlIHRvd2FyZCBhIGJpb2xvZ2ljYWwgaW50ZXJwcmV0YXRpb24gb2YgdGhlIGNsdXN0ZXJzIGFuZCBpZGVudGlmeWluZyB0aGUgY2VsbCB0eXBlcyBpbiBlYWNoIGNsdXN0ZXIuClRvIHRoYXQgZW5kLCB3
ZSBjYW4gaWRlbnRpZnkgbWFya2VyIGdlbmVzIHRoYXQgYXJlIGRpZmZlcmVudGlhbGx5IGV4cHJlc3NlZCBhbW9uZyBjbHVzdGVycy4KCkl0IGlzIHdvcnRoIG5vdGluZyBoZXJlIHRoYXQgdGhlIHN0YXRpc3RpY2FsIGNhbGN1bGF0aW9ucyBoZXJlIGFyZSBtb3JlIHRoYW4gYSBiaXQgY2lyY3VsYXI6IHdlIGlkZW50aWZpZWQgY2x1c3RlcnMgZmlyc3QgYmFzZWQgb24gZ2VuZSBleHByZXNzaW9uLCB0aGVuIHdlIGFyZSB1c2luZyB0aG9zZSBzYW1lIGNsdXN0ZXJzIHRvIGZpbmQgZGlmZmVyZW5jZXMgaW4gZ2VuZSBleHByZXNzaW9uLgpUaGUgcmVzdWx0IGlzIHRoYXQgZXZlbiBpZiB0aGVyZSB3ZXJlIG5vICp0cnVlKiBjbHVzdGVycywgd2Ugd291bGQgYWx3YXlzIGZpbmQgbWFya2VyIGdlbmVzIQpGb3IgYSBtdWNoIG1vcmUgdGVjaG5pY2FsIGV4cGxvcmF0aW9uIG9mIHRoaXMgY2lyY3VsYXJpdHkgKGFuZCBhIG1ldGhvZCB0byBjb3JyZWN0IGZvciBpdCksIHNlZSBhIHByZXByaW50IGJ5IFtHYW8gZXQgYWwuICgyMDIwKV0oaHR0cHM6Ly9hcnhpdi5vcmcvYWJzLzIwMTIuMDI5MzYpLgpJbiBsaWdodCBvZiB0aGlzLCBpdCBpcyBiZXR0ZXIgdG8gdGhpbmsgYWJvdXQgbWFya2VyIGdlbmUgaWRlbnRpZmljYXRpb24gYXMgYW4gYWlkIGluIGludGVycHJldGluZyB0aGUgY2x1c3RlcmluZyByZXN1bHRzIChhbmQgcG9zc2libHkgZXh0ZW5kaW5nIGluc2lnaHRzIHRvIG5ldyBkYXRhIHNldHMpLCByYXRoZXIgdGhhbiByZXN1bHRzIHRoYXQgc2hvdWxkIGJlIGludGVycHJldGVkIG9uIHRoZWlyIG93biwgYW5kIHdlIHNob3VsZCBiZSBleHRyZW1lbHkgd2FyeSBvZiBqdXN0aWZ5aW5nIGNsdXN0ZXIgYXNzaWdubWVudHMgc29sZWx5IGJhc2VkIG9uIHRoZXNlIHJlc3VsdHMhCldpdGggdGhhdCBjYXZlYXQsIGxldCdzIHByb2NlZWQuCgpUbyBpZGVudGlmeSBtYXJrZXIgZ2VuZXMsIHdlIHdpbGwgdXNlIHRoZSBgc2NyYW46OmZpbmRNYXJrZXJzKClgIGZ1bmN0aW9uLCB3aGljaCB3aWxsIHJhbmsgZ2VuZXMgYnkgdGhlaXIgZGlmZmVyZW50aWFsIGV4cHJlc3Npb24gYnkgY2FsY3VsYXRpbmcgcGFpcndpc2Ugc3RhdGlzdGljcyBhbW9uZyBjbHVzdGVycy4KV2UgaGF2ZSBhIGZldyBvcHRpb25zIGZvciBob3cgdG8gZGV0ZXJtaW5lIHRoZSBnZW5lIHJhbmtpbmdzIGFuZCBtYXJrZXIgZ2VuZSBsaXN0IGZvciBlYWNoIGNsdXN0ZXIuCkF0IG9uZSBlbmQgY291bGQgaW5jbHVkZSBnZW5lcyB0aGF0IGFyZSBkaWZmZXJlbnRpYWxseSBleHByZXNzZWQgaW4gKmFueSogcGFpcndpc2UgY29tcGFyaXNvbiBhZ2FpbnN0IG91ciBmb2NhbCBjbHVzdGVyLCBvciBhdCB0aGUgb3RoZXIgd2UgY291bGQgb25seSBpbmNsdWRlIGdlbmVzIHRoYXQgYXJlIGRpZmZlcmVudGlhbGx5IGV4cHJlc3NlZCBpbiAqYWxsKiBjb21wYXJpc29ucyB3aXRoIHRoYXQgY2x1c3Rlci4KV2UgY291bGQgYWxzbyBkbyBzb21ldGhpbmcgaW4gYmV0d2VlbiwgaW5jbHVkaW5nIGdl
bmVzIHRoYXQgZGlmZmVyZW50aWF0ZSB0aGUgZm9jYWwgY2x1c3RlciBmcm9tICpzb21lKiBmcmFjdGlvbiBvZiB0aGUgb3RoZXIgY2x1c3RlcnMuCkZvciBub3csIHdlIHdpbGwgdXNlIHRoZSBgZmluZE1hcmtlcnMoKWAgZnVuY3Rpb24gdG8gcmFuayB0aGUgZ2VuZXMgaW4gZWFjaCBjbHVzdGVyIGJ5IHRoZWlyIGNvbWJpbmVkIHNjb3JlcyBhZ2FpbnN0ICphbGwqIG90aGVyIGNsdXN0ZXJzLCB1c2luZyB0aGUgYHB2YWwudHlwZWAgYXJndW1lbnQuCgpgZmluZE1hcmtlcnMoKWAgd2lsbCByZXR1cm4gYSBsaXN0ICh0ZWNobmljYWxseSBhIGxpc3QtbGlrZSBvYmplY3QpIG9mIHRhYmxlcywgb25lIGZvciBlYWNoIGNlbGwgdHlwZSwgd2l0aCBzdGF0aXN0aWNzIGZvciBlYWNoIGdlbmUgc2hvd2luZyBob3cgd2VsbCBpdCBkaWZmZXJlbnRpYXRlcyB0aGF0IGNlbGwgdHlwZSBhZ2FpbnN0IG90aGVyIHR5cGVzLgoKCmBgYHtyIGZpbmRfbWFya2VycywgbGl2ZSA9IFRSVUV9CiMgdXNlIGBmaW5kTWFya2VycygpYCB0byBjYWxjdWxhdGUgaG93IHdlbGwgZWFjaCBnZW5lCiMgIGRpZmZlcmVudGlhdGVzIGVhY2ggY2x1c3RlciBmcm9tICphbGwqIG90aGVyIGNsdXN0ZXJzCm1hcmtlcnMgPC0gc2NyYW46OmZpbmRNYXJrZXJzKGhvZGdraW5zX3NjZSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgZ3JvdXBzID0gaG9kZ2tpbnNfc2NlJG5uY2x1c3RlciwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgcHZhbC50eXBlID0gImFsbCIpCmBgYAoKTmV4dCB3ZSBjYW4gbG9vayBhdCBvbmUgb2YgdGhvc2UgdGFibGVzLgpXZSB3aWxsIHN0YXJ0IHdpdGggdGhlIGZpcnN0IGNsdXN0ZXIsIHdoaWNoIHdlIHdpbGwgc2VsZWN0IGZyb20gdGhlIGxpc3QgdXNpbmcgdGhlIFIgc3RhbmRhcmQgZG91YmxlIGJyYWNrZXQgYFtbMV1dYCBub3RhdGlvbi4KV2UgYWxzbyBkb2luZyBhIGJpdCBvZiB0cmFuc2Zvcm1hdGlvbiBoZXJlIHRvIHB1bGwgdGhlIGdlbmUgbmFtZSBpbnRvIGEgY29sdW1uIG9mIGl0cyBvd24uCgpgYGB7ciBtYXJrZXJfdGFibGV9Cm1hcmtlcnNbWzFdXSB8PgogIGFzLmRhdGEuZnJhbWUoKSB8PiAjIGNvbnZlcnQgdG8gYSBkYXRhIGZyYW1lCiAgdGliYmxlOjpyb3duYW1lc190b19jb2x1bW4oImdlbmUiKSAjIG1ha2UgZ2VuZSBhIGNvbHVtbgpgYGAKCllvdSBjYW4gc2VlIHRoYXQgdGhpcyB0YWJsZSBpbmNsdWRlcyB2YWx1ZXMgZm9yIGFsbCBnZW5lcywgc28gd2Ugd291bGQgbGlrZSB0byBtYWtlIGEgc2hvcnRlciBsaXN0LgoKQmVjYXVzZSB3ZSB0ZW5kIHRvIGxpa2UgW3RpZHkgZGF0YV0oaHR0cHM6Ly9yNGRzLmhhZC5jby5uei90aWR5LWRhdGEuaHRtbCksIGhlcmUgd2UgdXNlIGEgYHRpZHl2ZXJzZWAgZnVuY3Rpb24gZnJvbSB0aGUgW2BwdXJycmAgcGFja2FnZV0oaHR0cHM6Ly9wdXJyci50aWR5dmVyc2Uub3JnKSB0byBhcHBseSB0aGUgc2FtZSBvcGVyYXRpb25zIGFzIGFib3ZlIHRvIGV2ZXJ5IGVsZW1lbnQgb2Yg
dGhlIGBtYXJrZXJzYCBsaXN0LgpXZSB3aWxsIGludHJvZHVjZSBgcHVycnJgIGJyaWVmbHkgaGVyZSwgYnV0IGlmIHlvdSB3YW50IG1vcmUgaW5mb3JtYXRpb24gYW5kIGJhY2tncm91bmQsIHdlIHJlY29tbWVuZCB0aGUgW2BwdXJycmAgY2hlYXRzaGVldCAoUERGKV0oaHR0cHM6Ly9naXRodWIuY29tL3JzdHVkaW8vY2hlYXRzaGVldHMvcmF3L21haW4vcHVycnIucGRmKSBhbmQgSmVubnkgQnJ5YW4ncyBncmVhdCBbYHB1cnJyYCB0dXRvcmlhbF0oaHR0cHM6Ly9qZW5ueWJjLmdpdGh1Yi5pby9wdXJyci10dXRvcmlhbC9pbmRleC5odG1sKS4KCgpUaGUgbWFpbiBmdW5jdGlvbnMgaW4gYHB1cnJyYCBhcmUgdGhlIGBtYXAoKWAgZnVuY3Rpb25zLCB3aGljaCB0YWtlIGFzIHRoZWlyIG1haW4gYXJndW1lbnRzIGEgKipsaXN0KiogYW5kIGEgKipmdW5jdGlvbioqIHRvIGFwcGx5IHRvIGVhY2ggZWxlbWVudCBvZiB0aGUgbGlzdC4KVGhlIG1haW4gZnVuY3Rpb24gaXMgYHB1cnJyOjptYXAoKWA7IGlmIHlvdSBhcmUgZmFtaWxpYXIgd2l0aCB0aGUgYmFzZSBSIGBsYXBwbHkoKWAgZnVuY3Rpb24sIGl0IGlzIHZlcnkgc2ltaWxhciwgYnV0IHdpdGggc29tZSBkaWZmZXJlbnQgZGVmYXVsdHMuCldlIHdpbGwgdXNlIGl0IHRvIGdldCB0aGUgdG9wIHJvd3MgZnJvbSBlYWNoIHRhYmxlIGJ5IGFwcGx5aW5nIHRoZSBgaGVhZCgpYCBmdW5jdGlvbiB0byBlYWNoIGVsZW1lbnQgb2YgdGhlIGxpc3QuClRoZSByZXN1bHRzIGFyZSByZXR1cm5lZCBhcyBhIG5ldyBsaXN0LgoKYGBge3IgaGVhZF9tYXJrZXJzLCBldmFsID0gRkFMU0V9CnB1cnJyOjptYXAoCiAgYXMubGlzdChtYXJrZXJzWzE6M10pLCAjIHNlbGVjdCB0aGUgZmlyc3QgMyBjbHVzdGVycyBhbmQgY29udmVydCB0byBhICdyZWd1bGFyJyBsaXN0IGZvciBwdXJycgogIGhlYWQgIyB0aGUgZnVuY3Rpb24gdG8gYXBwbHkgKG5vdGUgbm8gcGFyZW50aGVzaXMpCiAgKQpgYGAKClRoaXMgcmV0dXJucyBhIGxpc3Qgb2YgZGF0YSBmcmFtZXMsIHdoaWNoIGlzbid0IHF1aXRlIHdoYXQgd2Ugd2FudC4KClRoZXJlIGlzIG5vIGJ1aWx0LWluIGZ1bmN0aW9uIHRoYXQgd2lsbCBnaXZlIHVzIGp1c3QgdGhlIGZpcnN0IGZldyBfcm93IG5hbWVzXywgc28gd2Ugd2lsbCBoYXZlIHRvIGRlZmluZSBvbmUuCkFzIG9mIHZlcnNpb24gNC4xLCBSIGludHJvZHVjZWQgYSBuZXcgYXBwcm9hY2ggdG8gZGVmaW5pbmcgX2Fub255bW91cyBmdW5jdGlvbnNfIC0gdGhhdCBpcywgZnVuY3Rpb25zIHlvdSBjYW4gcXVpY2tseSBkZWZpbmUgIm9uLXRoZS1mbHkiIHdpdGhvdXQgZm9ybWFsbHkgYXNzaWduaW5nIHRoZW0gdG8gYSBmdW5jdGlvbiBuYW1lLgpUaGV5IGFyZSBoYW5keSB3aGVuIHlvdSBuZWVkIHRvIGRvIGEgdmVyeSBzaG9ydCB0YXNrIHRoYXQgcmVxdWlyZXMgYSBmdW5jdGlvbiwgYnV0IGl0IGlzbid0IHJlYWxseSBhIGZ1bmN0aW9uIHlvdSBuZWVkIGJleW9uZCB0aGlzIGNvbnRleHQuClRoaXMgbmV3IGFu
b255bW91cyBzeW50YXggbG9va3MgbGlrZSB0aGlzOiBgXCh4KS4uLmAgKG9yIGZvciBzbGlnaHRseSBsb25nZXIgY29kZSwgdXNlIGN1cmx5IGJyYWNlcyBhcyBpbiBgXCh4KSB7Li4ufWApLgpUaGlzIGRlZmluZXMgYSBmdW5jdGlvbiB0aGF0IHRha2VzIG9uZSBhcmd1bWVudCwgYHhgLCB3aXRoIGAuLi5gIGluZGljYXRpbmcgd2hlcmUgeW91IHdvdWxkIHB1dCB0aGUgZXhwcmVzc2lvbiB0byBjYWxjdWxhdGUuCgpgcHVycnI6Om1hcCgpYCB3aWxsIHRoZW4gYXBwbHkgdGhlIGV4cHJlc3Npb24gaW4gb3VyIGFub255bW91cyBmdW5jdGlvbiB0byBlYWNoIGVsZW1lbnQgb2YgdGhlIGxpc3QsIGFuZCByZXR1cm4gdGhlIHJlc3VsdHMgYXMgYSBuZXcgbGlzdC4KCmBgYHtyIGhlYWRfbWFya2VybmFtZXMsIGxpdmUgPSBUUlVFfQojIEdldCB0aGUgZmlyc3QgZmV3IHJvdyBuYW1lcyBvZiBlYWNoIHRhYmxlIHdpdGggYSBwdXJyciBmdW5jdGlvbi4KcHVycnI6Om1hcCgKICAjIGNvbnZlcnQgbWFya2VycyB0byBhICdyZWd1bGFyJyBsaXN0IGZvciBwdXJycgogIGFzLmxpc3QobWFya2VycyksCiAgIyBvdXIgY3VzdG9tIGZ1bmN0aW9uOgogIFwoeCkgaGVhZCggcm93bmFtZXMoeCkgKQopCmBgYAoKQW5vdGhlciB2YXJpYW50IGlzIGBwdXJycjo6aW1hcCgpYCwgd2hpY2ggYWxsb3dzIHVzIHRvIHVzZSB0aGUgbmFtZXMgb2YgdGhlIGxpc3QgZWxlbWVudHMgaW4gb3VyIGZ1bmN0aW9uLgooVHJ5IGBuYW1lcyhtYXJrZXJzKWAgdG8gc2VlIHRoZSBuYW1lcyBmb3IgdGhlIGxpc3Qgd2UgYXJlIHdvcmtpbmcgd2l0aCBub3cuKQpXZSB3aWxsIHVzZSB0aGF0IGhlcmUgdG8gbmFtZSBvdXRwdXQgZmlsZXMgd2hlcmUgd2Ugd2lsbCBwcmludCBlYWNoIG9mIHRoZSBtYXJrZXIgdGFibGVzLCBvbmUgZm9yIGVhY2ggY2VsbCB0eXBlLgpXZSBhcmUgYWdhaW4gZGVmaW5pbmcgYSBjdXN0b20gZnVuY3Rpb24gd2l0aGluIHRoZSBjYWxsIHRvIGBwdXJycjppbWFwKClgIHVzaW5nIHRoZSBgXCh4KWAgc3ludGF4LCBidXQgdGhpcyB0aW1lIHdlIG5lZWQgdHdvIHZhcmlhYmxlczogd2Ugd2lsbCB1c2UgYHRhYmxlYCBmb3IgdGhlIGxpc3QgZWxlbWVudHMgKGVhY2ggYSB0YWJsZSBvZiByZXN1bHRzKSBhbmQgYGlkYCBmb3IgdGhlaXIgbmFtZXMuClNvLCB3ZSdsbCBhY3R1YWxseSBzdGFydCBieSBkZWZpbmluZyB0aGUgZnVuY3Rpb24gYXMgYFwodGFibGUsIGlkKWAsIHNpbmNlIHRoZXJlIHdpbGwgYmUgdHdvIGlucHV0IGFyZ3VtZW50cy4KQmVjYXVzZSB3ZSBkb24ndCBrbm93IHRoZSBpZGVudGl0aWVzIG9mIHRoZSBjbHVzdGVycyB3ZSBpZGVudGlmaWVkLCB0aGVzZSBhcmUganVzdCB0aGUgY2x1c3RlciBudW1iZXJzIGZvciBub3cuCgpNYWtpbmcgZmlsZSBuYW1lcyBmcm9tIG51bWJlcnMgY2FuIGJlIGEgYSBiaXQgZnJhdWdodCwgYXMgd2UgcmVhbGx5IHdhbnQgdGhlbSB0byBzb3J0IGluIG51bWVyaWNhbCBvcmRlciwgYnV0IG1hbnkgc3lzdGVtcyB3aWxs
IHNvcnQgYnkgYWxwaGFiZXRpY2FsIG9yZGVyLgpVbmZvcnR1bmF0ZWx5LCB0aGF0IHdvdWxkIHRlbmQgdG8gc29ydCAxMC0xOSBiZWZvcmUgMiwgMjAtMjkgYmVmb3JlIDMsIGV0Yy4KVG8gc29sdmUgdGhpcywgd2UgYXJlIHVzaW5nIHRoZSBgc3ByaW50ZigpYCBmdW5jdGlvbiwgd2hpY2ggYWxsb3dzIHVzIHRvIHNwZWNpZnkgdGhlIGZvcm1hdCBvZiBhIHByaW50ZWQgc3RyaW5nLgpJbiB0aGlzIGNhc2UsIHdlIGFyZSB1c2luZyB0aGUgZm9ybWF0dGluZyBzeW50YXggb2YgYCUwMmRgIHRvIHRlbGwgaXQgdGhhdCB3ZSB3aWxsIHdhbnQgdG8gaW5zZXJ0IChgJWApIGEgbnVtYmVyIChgZGApLCB3aXRoIHR3byBkaWdpdHMgYW5kIGxlYWRpbmcgemVyb3MuClRvIHNlZSB3aGF0IHRoaXMgZG9lcyBhIGJpdCBtb3JlIGNvbmNyZXRlbHksIGxldCdzIGxvb2sgYXQgYSBzaW1wbGUgZXhhbXBsZToKCmBgYHtyIHNwcmludGZ9CnNwcmludGYoIiUwMmQiLCAxOjEwKQpgYGAKCgpJbiBhZGRpdGlvbiB0byB3cml0aW5nIHRoZSB0YWJsZXMgb3V0LCB3ZSBhcmUgc2F2aW5nIHRoZSBkYXRhIGZyYW1lcyB3ZSBjcmVhdGVkIGFzIGEgbmV3IGxpc3QgdGhhdCB3ZSBjYW4gdXNlIGluIHRoZSBuZXh0IHN0ZXAuCgpgYGB7ciB3cml0ZV90YWJsZXN9Cm1hcmtlcl9kZl9saXN0IDwtIHB1cnJyOjppbWFwKAogIGFzLmxpc3QobWFya2VycyksICMgY29udmVydCBtYXJrZXJzIHRvIGEgJ3JlZ3VsYXInIGxpc3QgZm9yIHB1cnJyCiAgIyBwdXJyciBmdW5jdGlvbjogeCBpcyB0aGUgbGlzdCBlbGVtZW50LCB5IGlzIHRoZSBlbGVtZW50IG5hbWUgKG51bWJlciBoZXJlKQogIFwodGFibGUsIGlkKSB7CiAgICBhcy5kYXRhLmZyYW1lKHRhYmxlKSB8PiAjIGZpcnN0IGNvbnZlcnQgdG8gYSBkYXRhIGZyYW1lCiAgICAgIHRpYmJsZTo6cm93bmFtZXNfdG9fY29sdW1uKCJnZW5lIikgfD4gIyBtYWtlIGdlbmVzIGEgY29sdW1uCiAgICAgIGRwbHlyOjphcnJhbmdlKEZEUikgfD4gIyBzb3J0IHRvIGJlIHN1cmUgc21hbGwgRkRSIGdlbmVzIGFyZSBmaXJzdAogICAgICByZWFkcjo6d3JpdGVfdHN2KCAjIHdyaXRlIGVhY2ggZGF0YSBmcmFtZSB0byBhIGZpbGUKICAgICAgICBmaWxlLnBhdGgoCiAgICAgICAgICBtYXJrZXJfZGlyLCAjIGNvbnN0cnVjdCB0aGUgb3V0cHV0IHBhdGgKICAgICAgICAgIHNwcmludGYoImNsdXN0ZXIlMDJkX21hcmtlcnMudHN2IiwgYXMuaW50ZWdlcihpZCkpICMgZm9ybWF0IGNsdXN0ZXIgbnVtYmVycyBpbiBmaWxlIG5hbWVzIHdpdGggbGVhZGluZyB6ZXJvcwogICAgICAgICkKICAgICAgKQogIH0KKQpgYGAKCgojIyMgUGxvdHRpbmcgbWFya2VyIGdlbmUgZXhwcmVzc2lvbgoKT25lIHRoaW5nIHdlIGNhbiBkbyB3aXRoIHRoaXMgbGlzdCBvZiBtYXJrZXIgZ2VuZXMgaXMgdG8gc2VlIGhvdyB0aGV5IGxvb2sgYWNyb3NzIHRoZSBjZWxscyBhbmQgY2x1c3RlcnMuClRoZSBgc2NhdGVyOjpwbG90UmVkdWNlZERpbSgpYCBmdW5jdGlvbiBtYWtl
cyB0aGlzIGVhc3khCldlIGhhdmUgZWFybGllciBjb2xvcmVkIHBvaW50cyBieSBzb21lIGNlbGwgc3RhdGlzdGljLCBsaWtlIHRoZSBudW1iZXIgb2YgZXhwcmVzc2VkIGdlbmVzLCBidXQgaXQgaXMganVzdCBhcyBlYXN5IHRvIGNvbG9yIGJ5IHRoZSBleHByZXNzaW9uIG9mIGEgc2luZ2xlIGdlbmUgYnkgdXNpbmcgdGhlIGdlbmUgaWRlbnRpZmllciBhcyB0aGUgYGNvbG9yX2J5YCBhcmd1bWVudC4KClRoZSBmaXJzdCBzdGVwIGlzIHRvIGdldCB0aGUgZ2VuZSBpbmZvcm1hdGlvbiBmb3IgdGhlIGdlbmVzIHdlIG1pZ2h0IGJlIGludGVyZXN0ZWQgaW4uCgpgYGB7ciBtYXJrZXJfaW5mbywgbGl2ZSA9IFRSVUV9CiMgZ2V0IGdlbmUgaWRzIGZvciB0b3AgMTAgY2x1c3RlciAxIG1hcmtlcnMKZ2VuZV9pZHMgPC0gbWFya2VyX2RmX2xpc3RbWzFdXSB8PgogIGhlYWQobiA9IDEwKSB8PgogIGRwbHlyOjpwdWxsKGdlbmUpCgojIGxvb2sgYXQgdGhlIGdlbmUgaW5mbyBmb3IgdGhlc2UKZ2VuZV9pbmZvIDwtIHJvd0RhdGEoaG9kZ2tpbnNfc2NlKVtnZW5lX2lkcywgXQpkYXRhLmZyYW1lKGdlbmVfaW5mbykKYGBgCgpOb3cgd2UgY2FuIHBpY2sgb25lIG9mIHRoZSBnZW5lcyBmb3IgcGxvdHRpbmcgYW5kIGdvIQoKYGBge3IgcGxvdF9tYXJrZXJfZXhwcmVzc2lvbiwgbGl2ZSA9IFRSVUV9CiMgZ2V0IGdlbmUgaWQgYW5kIGdlbmUgc3ltYm9sIGZvciBuaWNlciBwbG90dGluZwpyYW5rIDwtIDEKZ2VuZV9pZCA8LSBnZW5lX2luZm8kSURbcmFua10Kc3ltYm9sIDwtIGdlbmVfaW5mbyRTeW1ib2xbcmFua10KCiMgUGxvdCBVTUFQIHJlc3VsdHMgY29sb3JlZCBieSBleHByZXNzaW9uCnBsb3RSZWR1Y2VkRGltKGhvZGdraW5zX3NjZSwgIlVNQVAiLAogICAgICAgICAgICAgICBjb2xvcl9ieSA9IGdlbmVfaWQpICsKICAjIGxhYmVsIHRoZSBndWlkZSB3aXRoIHRoZSBnZW5lIHN5bWJvbAogIGd1aWRlcyhjb2xvciA9IGd1aWRlX2NvbG9yYmFyKHRpdGxlID0gc3ltYm9sKSkKYGBgCgoKSG9wZWZ1bGx5IHRoYXQgZXhwcmVzc2lvbiBwYXR0ZXJuIGFsaWducyBhdCBsZWFzdCBpbiBwYXJ0IHdpdGggeW91ciBleHBlY3RhdGlvbnMhCgojIyBOZXh0IHN0ZXBzCgpTbyBmYXIgd2UgaGF2ZSBpZGVudGlmaWVkIGNsdXN0ZXJzIG9mIGNlbGxzIChpZiB5b3UgYmVsaWV2ZSB0aGVtKSwgYW5kIGZvdW5kIHNvbWUgZ2VuZXMgdGhhdCBhcmUgYXNzb2NpYXRlZCB3aXRoIGVhY2ggY2x1c3Rlci4KV2hhdCB5b3UgbWlnaHQgd2FudCB0byBrbm93IGF0IHRoaXMgcG9pbnQgaXMgd2hhdCAqY2VsbCB0eXBlcyogY29tcHJpc2UgZWFjaCBjbHVzdGVyLgpTZXR0aW5nIGFzaWRlIHRoZSB0aG9ybnkgcXVlc3Rpb24gb2YgIndoYXQgaXMgYSBjZWxsIHR5cGU/IiwgdGhpcyBpcyBzdGlsbCBhIGNoYWxsZW5naW5nIHByb2JsZW0sIGFuZCB3ZSdsbCBleHBsb3JlIHNvbWUgYXBwcm9hY2hlcyB0byBwZXJmb3JtIGNlbGwgdHlwZSBhbm5vdGF0aW9uIGluIHRoZSBuZXh0IG5vdGVi
b29rIQoKCiMjIFNlc3Npb24gSW5mbwoKYGBge3Igc2Vzc2lvbn0Kc2Vzc2lvbkluZm8oKQpgYGAK
+ + +
+
+ +
+ + + + + + + + + + + + + + + + + diff --git a/completed-notebooks/scRNA-seq/06-celltype_annotation.nb.html b/completed-notebooks/scRNA-seq/06-celltype_annotation.nb.html new file mode 100644 index 0000000..097283d --- /dev/null +++ b/completed-notebooks/scRNA-seq/06-celltype_annotation.nb.html @@ -0,0 +1,4509 @@ + + + + + + + + + + + + + + + +Annotating cell types from scRNA-seq data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + +
+

Objectives

+

This notebook will demonstrate how to:

+
    +
  • Explore data from antibody-derived tags (CITE-seq)
  • +
  • Apply simple rule-based classification to identify cell types
  • +
  • Identify cell types by similarity to reference datasets with +SingleR
  • +
  • Apply SingleR classification to groups of cells
  • +
+
+

In this notebook, we will attempt to assign cell types to each of +the cells in a dataset, using some of the automated tools that are +available within the Bioconductor universe.

+

Much of the material in this notebook is directly inspired by, and +draws heavily on, material presented in the book Orchestrating Single +Cell Analysis with Bioconductor.

+

The data we will use for this notebook is derived from a 10x +Genomics dataset of human peripheral blood mononuclear cells +(PBMCs). These data include both single cell RNA-seq counts and +quantification of antibody-derived tags (ADTs) performed by sequencing +short DNA barcodes attached to specific antibodies. This type of ADT +sequencing with single cells is commonly known as CITE-seq, after the +protocol developed by Stoeckius et al. +(2017).
+The antibodies used here are the +TotalSeq™-B Human TBNK Cocktail, a set of antibodies designed to +react with immune cell surface markers.

+
+Single-cell roadmap: Cell type +
Single-cell roadmap: Cell type
+
+

The data here have already been filtered, normalized, and had +dimension reductions calculated for the single-cell RNA-seq data. The +ADT data has also been separately filtered and normalized. For details +about how to perform these tasks with data that has been processed with +Cell Ranger, you may want to look at the “Integrating +with protein abundance” chapter of OSCA.

+

The processed gene expression and ADT data were saved into a combined +SingleCellExperiment (SCE) object, and we will start with +that object for our exploration here.

+
+
+

Set up

+

To start, we will load some of the libraries we will need later, and +set a random number seed for reproducibility.

+ + + +
# Load libraries
+library(ggplot2) # plotting functions
+library(SingleCellExperiment) # Bioconductor single-cell data class
+ + +
Loading required package: SummarizedExperiment
+ + +
Loading required package: MatrixGenerics
+ + +
Loading required package: matrixStats
+ + +

+Attaching package: 'MatrixGenerics'
+ + +
The following objects are masked from 'package:matrixStats':
+
+    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
+    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
+    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
+    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
+    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
+    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
+    colWeightedMeans, colWeightedMedians, colWeightedSds,
+    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
+    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
+    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
+    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
+    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
+    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
+    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
+    rowWeightedSds, rowWeightedVars
+ + +
Loading required package: GenomicRanges
+ + +
Loading required package: stats4
+ + +
Loading required package: BiocGenerics
+ + +

+Attaching package: 'BiocGenerics'
+ + +
The following objects are masked from 'package:stats':
+
+    IQR, mad, sd, var, xtabs
+ + +
The following objects are masked from 'package:base':
+
+    anyDuplicated, aperm, append, as.data.frame, basename, cbind,
+    colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
+    get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
+    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
+    Position, rank, rbind, Reduce, rownames, sapply, setdiff, table,
+    tapply, union, unique, unsplit, which.max, which.min
+ + +
Loading required package: S4Vectors
+ + +

+Attaching package: 'S4Vectors'
+ + +
The following object is masked from 'package:utils':
+
+    findMatches
+ + +
The following objects are masked from 'package:base':
+
+    expand.grid, I, unname
+ + +
Loading required package: IRanges
+ + +
Loading required package: GenomeInfoDb
+ + +
Loading required package: Biobase
+ + +
Welcome to Bioconductor
+
+    Vignettes contain introductory material; view with
+    'browseVignettes()'. To cite Bioconductor, see
+    'citation("Biobase")', and for packages 'citation("pkgname")'.
+ + +

+Attaching package: 'Biobase'
+ + +
The following object is masked from 'package:MatrixGenerics':
+
+    rowMedians
+ + +
The following objects are masked from 'package:matrixStats':
+
+    anyMissing, rowMedians
+ + +
Warning: replacing previous import 'S4Arrays::makeNindexFromArrayViewport' by
+'DelayedArray::makeNindexFromArrayViewport' when loading 'SummarizedExperiment'
+ + +
# Setting the seed for reproducibility
+set.seed(12345)
+ + + +
+

Directories and files

+

As mentioned, our input file here is a single normalized and +processed SCE object, stored as an rds file. That should be +all we need to read in!

+

Our output will be a table of per-cell information, which will +include the cell type assignments we have made throughout this notebook. +We aren’t planning any significant modifications of the underlying data, +so we won’t bother re-saving the whole SCE object as a new +.rds file this time.

+ + + +
# directory for the input data
+data_dir <- file.path("data", 
+                      "PBMC-TotalSeqB", 
+                      "normalized")
+
+# the input file itself
+sce_file <- file.path(data_dir, 
+                      "PBMC_TotalSeqB_normalized_sce.rds")
+
+# A directory to store outputs
+analysis_dir <- file.path("analysis", 
+                          "PBMC-TotalSeqB")
+
+# Create directory if it doesn't exist
+fs::dir_create(analysis_dir)
+
+# output table path
+cellinfo_file <- file.path(analysis_dir, 
+                           "PBMC_TotalSeqB_cellinfo.tsv")
+ + + +
+
+
+

Exploring a CITE-seq SingleCellExperiment

+

Now that the preliminary setup is out of the way, we can get started. +First we will read in the SingleCellExperiment from the +input file we defined earlier.

+ + + +
# read in the SCE file
+sce <- readr::read_rds(sce_file)
+# print a summary of the SCE
+sce
+ + +
class: SingleCellExperiment 
+dim: 36601 7924 
+metadata(1): Samples
+assays(2): counts logcounts
+rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
+  ENSG00000277196
+rowData names(3): ID Symbol Type
+colnames: NULL
+colData names(13): Sample Barcode ... prob_compromised sizeFactor
+reducedDimNames(2): PCA UMAP
+mainExpName: Gene Expression
+altExpNames(1): ADT
+ + + +

This should look similar to the SCE objects that we have seen before, +containing counts and logcounts assays where +each cell is a column and each row is a gene. We also have some of the +rowData, colData and reduced dimension +matrices that we have seen before.

+

But where are the data from the ADTs? We wouldn’t necessarily want +those stored in the main data matrices, as the characteristics of ADT +barcode data are going to be quite different from gene expression +data.

+

To keep the ADT data separate from the RNA gene expression data, we +have split this data off into an alternative experiment +(altExp) slot. You can see the name of this +altExp on the line altExpNames above. We +could have more than one type of alternative experiment (such +as spike-in or ATAC-seq), but in this case, just the one.

+

To access the contents of the altExp slot, we can use +the altExp() function. Let’s look at what we have in that +slot:

+ + + +
# print a summary of the 'ADT' altExp
+altExp(sce, "ADT")
+ + +
class: SingleCellExperiment 
+dim: 10 7924 
+metadata(1): Samples
+assays(2): counts logcounts
+rownames(10): CD3 CD4 ... CD45 IgG1
+rowData names(3): ID Symbol Type
+colnames: NULL
+colData names(1): sizeFactor
+reducedDimNames(0):
+mainExpName: NULL
+altExpNames(0):
+ + + +

It is another SingleCellExperiment! Inception! Let’s +look at that embedded SCE more closely.

+

The first thing to note is that this altExp has the same +number of columns as the main SCE object. Those corresponded to the +individual cells before, and still do!
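To make the relationship between the main experiment and an altExp more concrete, here is a minimal sketch that builds a toy SCE from scratch. The object names (`toy_sce`, `rna_counts`, `adt_counts`) and the simulated counts are made up for illustration and are not part of the PBMC dataset; the sketch only assumes that the `SingleCellExperiment` package is installed.

```r
library(SingleCellExperiment)

set.seed(12345)
# toy count matrices: 4 genes and 2 ADTs measured in the same 5 cells
rna_counts <- matrix(rpois(20, lambda = 5), nrow = 4,
                     dimnames = list(paste0("gene", 1:4), NULL))
adt_counts <- matrix(rpois(10, lambda = 50), nrow = 2,
                     dimnames = list(c("CD3", "CD4"), NULL))

# the main experiment holds the gene expression counts
toy_sce <- SingleCellExperiment(assays = list(counts = rna_counts))

# store the ADT counts as an alternative experiment named "ADT"
altExp(toy_sce, "ADT") <- SingleCellExperiment(
  assays = list(counts = adt_counts)
)

# the name we assigned shows up in the list of alternative experiments
altExpNames(toy_sce)

# the rows differ between the two experiments, but the columns (cells) match
dim(toy_sce)
dim(altExp(toy_sce))
```

Because both experiments describe the same cells, any per-cell annotation we add to the main object can be used alongside the ADT values later on.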

+

There are only 10 rows, however, and these correspond to the ADTs +that were assayed by this particular experiment. Just as we did with the +full SCE, we can use rowData() to view the table containing +metadata associated with each of these rows. We’ll add the +altExp() function to point it to the embedded object we are +interested in. Since there is only one altExp, we don’t +need the second (name) argument ("ADT") that we used above; +the default behavior of altExp() is to just give us the +first altExp, and that is the one (and only) that we +need.

+ + + +
# What proteins were assayed?
+rowData(altExp(sce))
+ + +
DataFrame with 10 rows and 3 columns
+               ID                 Symbol             Type
+      <character>            <character>      <character>
+CD3           CD3                    CD3 Antibody Capture
+CD4           CD4                    CD4 Antibody Capture
+CD8           CD8                    CD8 Antibody Capture
+CD11c       CD11c                  CD11c Antibody Capture
+CD14         CD14                   CD14 Antibody Capture
+CD16         CD16                   CD16 Antibody Capture
+CD19         CD19                   CD19 Antibody Capture
+CD56         CD56                   CD56 Antibody Capture
+CD45         CD45                   CD45 Antibody Capture
+IgG1         IgG1 IgG1_control_TotalSeqC Antibody Capture
+ + + +

You can see here the names and symbols of the tags used, along with +the designation that all have an “Antibody Capture” type (as opposed to +“Gene Expression” for the RNA data). The one tag that looks different +is the IgG1 control, which is actually a mouse antibody +used as a negative control.

+
+

Clustering redux

+

While dimension reduction was performed on this data, we have not yet +performed any clustering.

+

Let’s assign some clusters to our cells, using graph-based clustering +and default parameters, taking as input the PCA matrix that was +previously calculated. Note that this PCA matrix and the UMAP built from +it were derived from the gene expression data, so the clustering is +going to reflect the gene expression data only. While we have the ADT +data, it is not being used for this stage of the analysis.

+ + + +
# perform clustering
+nn_clusters <- bluster::clusterRows(
+  # PCA input
+  reducedDim(sce, "PCA"), 
+  # graph clustering & parameters
+  bluster::NNGraphParam()
+)
+
+# add clusters to colData
+sce$nn_cluster <- nn_clusters
+ + + +

Now we can plot the clusters we have identified with +scater::plotUMAP(). This is a shortcut for +scater::plotReducedDim(dimred = "UMAP", ...), which can +save us a lot of typing as we do this repeatedly!

+ + + +
# plot clusters
+scater::plotUMAP(sce, color_by = "nn_cluster") + 
+  # rename the legend
+  guides(color = guide_legend(title = "Cluster"))
+ + +

+ + + +

But what are these clusters, really? Do they correspond to particular +cell types that we are interested in?

+

Does it bother you that we just used the default nearest-neighbor +graph clustering parameters? Do you know what those were?
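If you would rather not rely on defaults you cannot see, one option is to spell the parameters out. The sketch below uses the values we set explicitly in the previous notebook (`k = 10` neighbors and the `walktrap` algorithm), which were described there as the defaults; it is worth confirming against the documentation for your installed version of `bluster`. The matrix `toy_pcs` is simulated data standing in for a real PCA matrix.

```r
set.seed(12345)
# simulated "PCA" matrix: 200 cells (rows) by 5 components (columns)
toy_pcs <- matrix(rnorm(200 * 5), nrow = 200)
# shift half of the cells along the first component to create two groups
toy_pcs[1:100, 1] <- toy_pcs[1:100, 1] + 8

# graph-based clustering with the parameters written out explicitly
toy_clusters <- bluster::clusterRows(
  toy_pcs,
  bluster::NNGraphParam(k = 10, cluster.fun = "walktrap")
)

# how many cells ended up in each cluster?
table(toy_clusters)
```

Writing the parameters out this way also makes it easier to change them later and see how the number of clusters responds.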

+
+
+
+

Investigating cell types

+
+

Using ADT data

+

The first way we will identify cell types of individual cells is to +use the ADT normalized counts. These antibody markers were (hopefully) +chosen for their relevance to the sequenced cell population.

+

The first marker we will look at is CD3, which is a +protein complex that is found on the surface of T cells. We can again +use the plotUMAP() function to color cells by +CD3 ADT levels.

+

Note that this function can plot data from the colData +table (as we used it above when plotting clusters), from the main gene +expression matrix (as we used it in the previous notebook), AND +from altExp tables and matrices! So to color by the ADT +levels (as normalized in the logcounts matrix) we only need +to provide the tag name that we want to plot in the +color_by argument.

+ + + +
# plot CD3 expression
+scater::plotUMAP(sce, color_by = "CD3")
+ + +

+ + + +

It appears that we have a number of potential T cells down in the +lower left!

+

Let’s look at a couple of other markers to try to break those up more +specifically.

+

Two other markers of relevance to the T cells are CD4 +and CD8. The CD4 complex is present in helper +T cells (hence their other common name, CD4+ T cells). By contrast, the +CD8 complex is found on killer T cells (CD8+ cells).

+

Let’s plot the ADT results for those two markers as well below:

+ + + +
# plot CD4 marker
+scater::plotUMAP(sce, 
+                 color_by = "CD4")
+ + +

+ + + + + + +
# plot CD8 marker
+scater::plotUMAP(sce, 
+                 color_by = "CD8")
+ + +

+ + + +
+
+

Rule-based classification

+

Plotting the levels of the ADTs provides a nice visual +representation, but what we really want to do is to turn these values +into specific cell-type assignments for each cell. Such classification +could be considered as analogous to a cell-sorter assay, where we would +set up some rules to look at a few markers for each cell and use those +to assign a cell type. The simplest type of rule might be one where we +use a threshold to call a marker as present or absent, and then use the +presence of a marker to indicate a specific cell type.

+

To do this, we will need to make some decisions, such as the +thresholds we should use to determine whether a cell is or is not +expressing a particular marker. In general, markers that are useful for +this cell-typing approach will have a bimodal distribution of expression +levels which can be used to separate the population into two groups of +cells. One group of cells will have only a background level signal for +each marker (due to non-specific binding or other factors), while the +other group, those that express the protein, will have a much higher +level of binding and higher counts.
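As a rough sketch of what thresholding a bimodal marker looks like, the example below simulates normalized values for a single hypothetical marker, with a background group and an expressing group, and then applies a cutoff. The group sizes, means, and the threshold of 5 are all invented for illustration; with real data we would choose the threshold by looking at the distribution.

```r
set.seed(2024)
# simulated normalized counts for one marker in 1000 cells:
# 700 cells with background signal only, 300 cells expressing the protein
background <- rnorm(700, mean = 2, sd = 0.5)
expressing <- rnorm(300, mean = 9, sd = 0.8)
toy_logcounts <- c(background, expressing)

# a hypothetical threshold placed in the valley between the two modes
toy_threshold <- 5

# call the marker present in cells above the threshold
marker_positive <- toy_logcounts > toy_threshold

# count cells called negative (FALSE) and positive (TRUE)
table(marker_positive)
```

When the two modes are this well separated, the exact threshold matters little; with real ADT data the modes often overlap, and some cells near the threshold will be ambiguous.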

+

To assess whether the ADTs we have chosen have a useful distribution +of expression values, and to identify thresholds we might use, we would +like to plot each ADT tag. To do this, we will pull out the expression +values for these markers from the SCE object and do some data +wrangling.

+

We are interested in the normalized counts for the ADT tags, which +are stored in the logcounts assay of the +altExp. If you recall, this matrix is stored with the +columns as cells and rows as markers, but we really want it with each +row a cell and each column a marker. So we will first transpose the +data, then convert it to a data frame for our next steps. Because the +SCE object stores the assay data matrices in a specialized format, we +have to do one extra step to convert it first to a “regular” R matrix or R +won’t know how to convert it to a data frame.

+ + + +
# convert logcounts data to a data frame
+adt_df <- logcounts(altExp(sce)) |>
+  t() |> # transpose
+  as.matrix() |> # convert to matrix
+  as.data.frame() # convert to data frame
+
+# view the data frame
+head(adt_df)
+ +
+ +
+ + +

If we just wanted to plot one of these tags, we could do so right +away, but with a bit more data wrangling, we can convert these results +into a “tidier” format, that will allow us to take full advantage of +tidyverse tools! In particular, it will let us plot them +all at once with ggplot2 faceting.

+

Right now the data is in a “wide” format, such that each column is a +different tag. But the data in all of the columns is the same type, and +measures something similar: the normalized count of an ADT. One could +even argue that each row contains 10 different observations, where the +“tidy” data ideal, as espoused by Wickham (2014), +requires a single observation per row, a “long” format. This long format +will have one column that tells us which ADT was measured and a second +column with the measurement value itself.

+

We can perform this conversion using the tidyr::pivot_longer() +function, which allows us to convert our data frame with one column per +tag into a data frame with separate columns for the tag id +(ADT) and the expression value (logcount). +Following conversion, we will filter to just the ADTs that we care +about.

+ + + +
adt_df_long <- adt_df |>
+  # pivot to long format
+  tidyr::pivot_longer(
+    everything(), # use all columns
+    names_to = "ADT", # convert column names to a column called "ADT"
+    values_to = "logcount" # name the value column "logcount"
+  ) |>
+  # filter to tags we are interested in
+  dplyr::filter(ADT %in% c("CD3", "CD4", "CD8"))
+
+# look at the resulting df
+head(adt_df_long)
+ +
+ +
+ + +
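The wide-to-long reshaping can be illustrated on a toy table. This is a minimal sketch in base R that uses stack() as a stand-in for tidyr::pivot_longer(); the data frame toy_wide and its values are made up for illustration and do not come from the dataset.

```r
# hypothetical wide table: one row per cell, one column per ADT
toy_wide <- data.frame(
  CD3 = c(7.1, 2.0),
  CD4 = c(8.5, 1.2),
  CD8 = c(0.9, 6.4)
)

# stack() returns one row per measurement, in columns `values` and `ind`
toy_long <- stack(toy_wide)
# rename to match the column names used in this notebook
names(toy_long) <- c("logcount", "ADT")

# 2 cells x 3 tags gives 6 rows
nrow(toy_long)

# keep only some tags, as dplyr::filter() does in the chunk above
toy_subset <- toy_long[toy_long$ADT %in% c("CD3", "CD4"), ]
toy_subset
```

One difference worth knowing: stack() orders the rows column by column, while pivot_longer() keeps the measurements from each original row together. The content is the same either way.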

Now we can make a density plot with ggplot2 for all +three ADTs we are interested in at once.

+ + + +
# plot logcounts by ADT
+ggplot(adt_df_long, aes(x = logcount, fill = ADT)) + 
+  geom_density() + # density plot
+  facet_grid(rows = vars(ADT)) + # facet by ADT
+  theme_bw() + # nicer theme
+  theme(legend.position = "none") # no legend needed
+ + +

+ + + +

These look pretty good! Each of these markers has a bimodal +distribution: A lower peak consisting of cells that do not express the +protein but which still have a background level of antibody binding, and +an upper peak of cells that do express the protein of interest. The +background level does vary by antibody marker, so we will need a +different threshold value for each one.

+

We can now use the values from these plots to construct a set of +rules to classify the T cells. We will do this using the “wide” data +frame from earlier.

+

The thresholds we are using here were identified just “by eye”, so +this is not a particularly principled method of cell type assignment, +but it can be fairly effective. Here we are assigning only three cell +types; cells that do not fit any of these criteria will be set as +NA.

+ + + +
# add cell type column by thresholding
+adt_df <- adt_df |>
+  dplyr::mutate(
+    celltype = dplyr::case_when(
+      CD3 > 6.7 & CD4 > 8 ~ "CD4+ T-cell",
+      CD3 > 6.7 & CD8 > 6 ~ "CD8+ T-cell",
+      CD3 > 6.7 ~ "T-cell"
+    )
+  )
+
+adt_df
+ +
+ +
+ + +
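Because case_when() uses the first condition that is TRUE for each row, the order of the rules matters: a cell that passes both the CD4 and CD8 thresholds is labeled by whichever rule comes first. The sketch below shows the same logic in base R with nested ifelse() calls. The thresholds are the ones used above, but the data frame toy_adt and its values are hypothetical.

```r
# hypothetical normalized ADT values for five cells
toy_adt <- data.frame(
  CD3 = c(7.5, 7.2, 7.0, 3.0, 7.4),
  CD4 = c(9.0, 2.0, 1.0, 9.5, 8.5),
  CD8 = c(1.0, 7.0, 2.0, 1.0, 6.5)
)

# nested ifelse() checks the rules in order, like case_when()
toy_adt$celltype <- with(
  toy_adt,
  ifelse(
    CD3 > 6.7 & CD4 > 8, "CD4+ T-cell",
    ifelse(
      CD3 > 6.7 & CD8 > 6, "CD8+ T-cell",
      ifelse(CD3 > 6.7, "T-cell", NA_character_)
    )
  )
)

# the fourth cell fails the CD3 threshold, so it stays NA;
# the fifth cell passes both CD4 and CD8 thresholds and gets the first rule
toy_adt$celltype
```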

Now we will want to add the cell types we have assigned back to our +original SCE object. We can do that by defining a new column name, +threshold_celltype, that will be added to the +colData object. Creating and assigning values to this +column can be done with the $ shortcut, and then we can +plot our results with the plotUMAP() function as +before.

+ + + +
sce$threshold_celltype <- adt_df$celltype
+scater::plotUMAP(sce, 
+                 color_by = "threshold_celltype") + 
+  guides(color = guide_legend(title = "Cell type"))
+ + +

+ + + +

How did we do?

+

Note that while we applied this technique to assign cell types using +the ADT data, we could use the same type of procedure using gene +expression data alone, or a combination of gene expression data and tag +data.

+

However, what we did here was very ad-hoc and quite manual! We didn’t +calculate any statistics, and we had to look at every tag we were +interested in to pick thresholds. A different dataset might have +different background levels, which would require different +thresholds.

+

While this technique might be good for some simple experiments, and +can be useful for manual curation, it might not translate well to more +complex datasets with multiple samples. We also looked at each marker +separately, which might not be the most efficient or robust method of +analysis.

+

For a more principled approach that allows identification of cell +types by looking at the expression of sets of genes that are known to +characterize each cell type, you might look at the AUCell +package. For more on that method, the OSCA section Assigning +cell labels from gene sets is a very good reference.

+
+
+
+

Cell type annotation with SingleR

+

An alternative approach to using known marker genes for +classification is to instead classify cells by comparing them to a +reference expression dataset. To do this, we will find a well-curated +gene expression dataset that contains samples with known cell types. We +can then train a model based on this dataset and look at each of the +cells in our new dataset to determine which (if any) of the known cell +types has the most similar expression pattern. The details of how such a +model may be constructed and trained will vary by the specific method, +but this overall approach is widely applied.

+

For this section, we will focus on the SingleR package +and its methods, which are described in detail in The SingleR +Book.

+
+

Reference datasets

+

Selecting a reference dataset is one of the more critical steps for +this enterprise. At the most basic level, if the reference dataset does +not include the types of cells that we expect to see in our sample, it +won’t be useful. So we will want a reference dataset that has as many as +possible of the cell types that we expect to find in our dataset, at a +level of granularity that aligns with our goals.

+

For SingleR that reference data can be from bulk RNA +sequencing or from other single-cell experiments. SingleR +is also fairly robust to the method used for gene expression +quantification, which means that we can use either RNA-seq datasets or +microarrays, if those are more readily available.

+

One convenient source of cell reference data is the +celldex package, which is what we will use here. This +package includes functions to download a variety of well-annotated +reference datasets in a common format.
+For more information on the datasets available, you will want to refer +to the +celldex summary vignette.

+

We will start by using a reference dataset of sorted immune cells +from GSE107011 +(Monaco et al. 2019). This particular reference was chosen +because it is well-suited to PBMC datasets, with a good level of +granularity.

+

The celldex functions also have a convenient option to +convert gene symbols to Ensembl ids, which we will use here so that our +reference data uses the same gene identifiers as the single-cell +data.

+ + + +
# Bioconductor "Hub" packages provide the option to cache
+#   downloads, but the interactive prompt can be annoying
+#   when working with notebooks.
+# These options disable the prompt by giving permission 
+#   to create the cache automatically
+ExperimentHub::setExperimentHubOption("ASK", FALSE)
+AnnotationHub::setAnnotationHubOption("ASK", FALSE)
+
+# Get Monaco 2019 data from celldex with Ensembl ids.
+monaco_ref <- celldex::MonacoImmuneData(ensembl = TRUE)
+ + +
Warning: replacing previous import 'S4Arrays::makeNindexFromArrayViewport' by
+'DelayedArray::makeNindexFromArrayViewport' when loading 'HDF5Array'
+ + +
downloading 1 resources
+ + +
retrieving 1 resource
+ + +
loading from cache
+ + +
require("ensembldb")
+ + + +

What is this monaco_ref object?

+ + + +
monaco_ref
+ + +
class: SummarizedExperiment 
+dim: 46077 114 
+metadata(0):
+assays(1): logcounts
+rownames(46077): ENSG00000121410 ENSG00000268895 ... ENSG00000159840
+  ENSG00000074755
+rowData names(0):
+colnames(114): DZQV_CD8_naive DZQV_CD8_CM ... G4YW_Neutrophils
+  G4YW_Basophils
+colData names(3): label.main label.fine label.ont
+ + + +

A SummarizedExperiment is very similar to a +SingleCellExperiment, except rather than having one column +per cell, each column is a sample. Otherwise, the components +are very similar: each row is still a gene, for example, and additional +data about the samples are stored in the colData. In fact, +the SingleCellExperiment object is derived from a +SummarizedExperiment, with some extra slots that are more +relevant to single-cell data.

+

What information do we have for the samples?

+ + + +
colData(monaco_ref)
+ + +
DataFrame with 114 rows and 3 columns
+                      label.main             label.fine   label.ont
+                     <character>            <character> <character>
+DZQV_CD8_naive      CD8+ T cells      Naive CD8 T cells  CL:0000900
+DZQV_CD8_CM         CD8+ T cells Central memory CD8 T..  CL:0000907
+DZQV_CD8_EM         CD8+ T cells Effector memory CD8 ..  CL:0000913
+DZQV_CD8_TE         CD8+ T cells Terminal effector CD..  CL:0001062
+DZQV_MAIT                T cells             MAIT cells  CL:0000940
+...                          ...                    ...         ...
+G4YW_NK                 NK cells   Natural killer cells  CL:0000623
+G4YW_pDC         Dendritic cells Plasmacytoid dendrit..  CL:0000784
+G4YW_mDC         Dendritic cells Myeloid dendritic ce..  CL:0000782
+G4YW_Neutrophils     Neutrophils Low-density neutroph..  CL:0000096
+G4YW_Basophils         Basophils  Low-density basophils  CL:0000043
+ + + +

There are three main columns for the sample data:

+
    +
  • label.main is a more general cell type +assignment.

  • +
  • label.fine is a fine-level cell type with more +specific labels. The exact level of granularity of these +main and fine designations (and indeed the +label names themselves) will vary among datasets, so it is important to +look at the reference to see whether it is suitable for your +application.

  • +
  • label.ont is a standardized Cell Ontology +identifier. Using the cell ontology can allow for more complex +representations of the relationships among different cell types, but +investigating that is beyond the scope of this workshop.

  • +
+

Another component we would like to explore is how many of each of +these cell types we have in the reference dataset. A bit of quick +dplyr wrangling can give us the answer.

+ + + +
colData(monaco_ref) |> 
+  as.data.frame() |>
+  dplyr::count(label.main, label.fine)
+ +
+ +
+ + +

This is pretty good! Most cell types have 4 replicates, which is more +replicates than we often find.

+
+
+

What does SingleR do?

+

As mentioned earlier, SingleR builds a model from a set +of training data, and then uses that model to classify cells (or groups +of cells) in new datasets.

+

SingleR works by first identifying a set of marker genes +that can be used to differentiate among the cell types in the reference +dataset. It does this by performing pairwise comparisons among all of +the cell types, and retaining the top set of genes differentiating each +pair. The idea is that this set of genes will be the most informative +for differentiating cell types.

+

Then, for each cell, SingleR calculates the Spearman +correlation between expression of that cell and each cell type (using +only the genes chosen earlier). Notably, this is a non-parametric +correlation, so the scaling and normalization that we apply (or don’t) +should not matter! Note that if you used a single-cell technology that +produces full-length transcripts (e.g., SMART-seq), you will probably +want to convert your counts to Transcripts per Million (TPM), to allow +more consistent ranking among transcripts of different lengths.
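The claim about scaling can be checked directly: Spearman correlation depends only on ranks, so any monotonic transformation of one cell's values leaves it unchanged. The following is a small base R sketch with made-up numbers (cell_counts and ref_profile are hypothetical), with Pearson correlation shown for contrast.

```r
# hypothetical expression values for six genes in one cell and one reference
cell_counts <- c(0, 3, 10, 1, 25, 7)
ref_profile <- c(1, 4, 8, 0, 30, 5)

# Spearman correlation on the raw values
rho_raw <- cor(cell_counts, ref_profile, method = "spearman")
# a log transformation is monotonic, so the ranks do not change
rho_log <- cor(log1p(cell_counts), ref_profile, method = "spearman")

# Pearson correlation, for contrast, does depend on the transformation
r_raw <- cor(cell_counts, ref_profile)
r_log <- cor(log1p(cell_counts), ref_profile)

# TRUE: the rank-based correlation is identical before and after
all.equal(rho_raw, rho_log)
```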

+

The reference cell type with the highest correlation is then chosen +as the cell type assignment for that cell. If there are multiple cell +types with high scores, an optional fine-tuning step repeats the process +using only the most relevant genes for those cell types.
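The core idea of "score each reference label, then take the best" can be sketched in a few lines of base R. This is a deliberately simplified illustration and not the SingleR implementation, which aggregates scores over multiple reference samples per label and adds fine-tuning; the matrix ref_mat and the vector query_cell are hypothetical.

```r
# hypothetical reference: expression of five genes in three cell types
ref_mat <- cbind(
  "B cells"   = c(10, 1, 0, 2, 5),
  "T cells"   = c(0, 9, 8, 1, 2),
  "Monocytes" = c(1, 0, 2, 10, 7)
)
rownames(ref_mat) <- paste0("gene", 1:5)

# hypothetical query cell measured on the same genes
query_cell <- c(gene1 = 0.5, gene2 = 5, gene3 = 6, gene4 = 0, gene5 = 1)

# Spearman correlation between the query and each reference column
scores <- apply(ref_mat, 2, function(profile) {
  cor(query_cell, profile, method = "spearman")
})

# the label with the highest score becomes the assignment
best_label <- names(which.max(scores))
best_label
```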

+
+
+

Running SingleR

+

For our first run, we will do the marker gene selection (training) +and classification in a single step, using the convenience function +SingleR::SingleR(). For this we need only supply three main +arguments: Our SCE object, a reference matrix (here in +SummarizedExperiment format), and the labels for each of +the samples in the reference that we want to use. We also need to be +sure that our sample and the reference data use the same gene IDs, which +is why we requested the Ensembl IDs when getting the reference +dataset.

+

Because this function is doing many repetitive calculations (lots of +correlations!), we can speed it up by including the BPPARAM +argument. This is a common argument in Bioconductor +packages where BP stands for the BiocParallel +package, which provides multiprocessing capabilities to many +Bioconductor functions. In this case, we will use the argument +BiocParallel::MulticoreParam(4) to specify we want to use +local multicore processing with 4 “workers”.

+ + + +
# calculate SingleR results in one step
+singler_result <- SingleR::SingleR(
+  sce, # our query SCE
+  ref = monaco_ref, # reference dataset
+  labels = monaco_ref$label.main, # reference labels to use
+  BPPARAM = BiocParallel::MulticoreParam(4) # multiprocessing
+)
+ + + +

SingleR provides a few nice visualizations for +evaluating the statistics it calculated and the assignments it makes. +One is a heatmap of the scores for each cell, arranged by the cell type +that was assigned to each. This is created with the +SingleR::plotScoreHeatmap() function.

+ + + +
SingleR::plotScoreHeatmap(singler_result)
+ + +

+ + + +

We can also pull out individual components of the results object for +plotting in the context of our input SCE object. Here we will save the +pruned labels (where low-quality assignments have been given an +NA label), storing them back in our SCE object +(specifically to a new column of the colData table).

+ + + +
sce$celltype_main <- singler_result$pruned.labels
+ + + +

Now we can plot the cell type assignments onto our UMAP to see how +they compare to the patterns we saw there before.

+ + + +
scater::plotUMAP(sce, color_by = "celltype_main") 
+ + +

+ + + +

Annoyingly, the NA and T cells labels are +quite close in color, and the scater and +SingleR packages don’t agree on color choices. Luckily, +since plotUMAP() returns a ggplot object, we +can modify the color palette using ggplot2 functions. Still +annoyingly, however, when we change the palette, the legend title +defaults to the uninformative name "colour_by", so we’ll +also specify a matching legend title with our new color palette.

+ + + +
scater::plotUMAP(sce, color_by = "celltype_main") +
+  scale_color_brewer(name = "Cell type", # legend title
+                     palette = "Dark2",      # color palette
+                     na.value = "gray80")    # use light gray for NA values
+ + +
Scale for colour is already present.
+Adding another scale for colour, which will replace the existing scale.
+ + +

+ + + +

We seem to have a pretty good set of cell type assignments, with most +falling into groupings consistent with what we see in the UMAP plot.

+

We can thank the fact that this is a PBMC sample and that we have a +good reference dataset for these cell types for the cleanliness of this +plot. Quite often with other kinds of samples (especially cancer cells!) +things will be much less clean!

+

We can also look to see how the cell type assignments are distributed +using the base R function table(). Since we like to keep +track of the cells that ended up as NA in the pruned +labels, we will include the useNA = "ifany" argument.

+ + + +
table(singler_result$pruned.labels, useNA = "ifany")
+ + +

+        B cells    CD4+ T cells    CD8+ T cells Dendritic cells       Monocytes 
+            692            1291             904             232            3622 
+       NK cells     Progenitors         T cells            <NA> 
+            345              47             646             145 
+ + + +
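The effect of the useNA argument is easy to miss, so here is a tiny base R sketch with a made-up label vector (toy_labels is hypothetical). By default, table() silently drops NA values, which would hide the pruned cells.

```r
# hypothetical pruned labels, with two cells that were pruned to NA
toy_labels <- c("B cells", "T cells", NA, "T cells", NA)

# default behavior: NA values are not counted
counts_default <- table(toy_labels)

# with useNA = "ifany", an NA category is added when any NA is present
counts_all <- table(toy_labels, useNA = "ifany")

sum(counts_default) # counts only the labeled cells
sum(counts_all)     # counts every cell
```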
+
+

Exploring finer labels

+

In the previous cell typing, we used the label.main +column, but we also had label.fine, so let’s use that to +explore the dataset in a bit more detail.

+

We will also take this time to dive a bit deeper into the steps that +SingleR performed. As mentioned, the first step is training +the model, during which we identify the genes that will be used for the +correlation analysis later. While this step is not particularly slow, if +we were classifying multiple samples, we would not want to have to +repeat it for every sample.

+

To do the training, we will use the trainSingleR() +function. For this we will start with our reference and the labels we +want to train the model with.

+

We can then specify the method used to select the genes that will be +used for classification. The default method is "de", which +performs a differential expression analysis for each pair of labels, but +we could also use "sd" to select the genes which are most +variable across labels, or "all" to use all genes. If we +want to get really fancy, we could even provide a specific list of genes +to use.

+

We should note here that the reference dataset for +SingleR does not need to be from a compendium like +celldex! If you have any well-classified dataset that you +want to use as a reference, you can, as long as you can create a gene-by-sample +expression matrix and a vector of cell types for each sample. You +will want to ensure that the cell types you expect to see in your sample +are present in the reference dataset, and data should be normalized, but +otherwise the method can be quite flexible. You can even use a +previously-annotated SingleCellExperiment as a reference +for a new dataset. For more details about custom references, see the OSCA +chapter on cell type annotation.

+

We do want to be sure that the genes selected for the model will be +among those present in our SCE object, so we will use the +restrict argument with a vector of the genes in our SCE. +This step would happen automatically with the +SingleR::SingleR() function, but we need to add it manually +for this use case.

+ + + +
# build fine model
+singler_finemodel <- SingleR::trainSingleR(
+  monaco_ref, # reference dataset
+  labels = monaco_ref$label.fine, # labels for training dataset
+  # use DE to select genes (default)
+  genes = "de", 
+  # only use genes in the sce object
+  restrict = rownames(sce),
+  # parallel processing
+  BPPARAM = BiocParallel::MulticoreParam(4)
+)
+ + + +

Now we can perform the classification step, using our SCE object and +the SingleR model that we just created.

+ + + +
# classify with fine model
+singler_result_fine <- SingleR::classifySingleR(
+  sce, # our SCE object
+  singler_finemodel, # the trained model object
+  # perform fine tuning (default)
+  fine.tune = TRUE,
+  # parallel processing
+  BPPARAM = BiocParallel::MulticoreParam(4)
+)
+ + + +

What labels were assigned, and how many of each?

+ + + +
table(singler_result_fine$pruned.labels, useNA = "ifany")
+ + +

+   Central memory CD8 T cells           Classical monocytes 
+                          121                          2926 
+  Effector memory CD8 T cells             Exhausted B cells 
+                           31                            48 
+    Follicular helper T cells        Intermediate monocytes 
+                          135                           424 
+        Low-density basophils                    MAIT cells 
+                            5                           112 
+      Myeloid dendritic cells                 Naive B cells 
+                          162                           311 
+            Naive CD4 T cells             Naive CD8 T cells 
+                          600                           752 
+         Natural killer cells       Non classical monocytes 
+                          320                           189 
+  Non-switched memory B cells            Non-Vd2 gd T cells 
+                          250                           163 
+                 Plasmablasts  Plasmacytoid dendritic cells 
+                            3                            93 
+             Progenitor cells       Switched memory B cells 
+                           36                            81 
+           T regulatory cells Terminal effector CD4 T cells 
+                          163                            39 
+Terminal effector CD8 T cells                     Th1 cells 
+                          115                            97 
+               Th1/Th17 cells                    Th17 cells 
+                          135                           133 
+                    Th2 cells                Vd2 gd T cells 
+                          144                           146 
+                         <NA> 
+                          190 
+ + + + + + +
# add fine labels to SCE
+sce$celltype_fine <- singler_result_fine$pruned.labels
+# plot UMAP with fine labels
+scater::plotUMAP(sce, color_by = "celltype_fine")
+ + +
Warning: Removed 190 rows containing missing values or values outside the scale range
+(`geom_point()`).
+ + +

+ + + +

That’s a pretty messy plot. Mostly that is because there are +lots of cell types here, and not enough colors to represent +them all. The NA cells also got taken off completely, which +is not ideal.

+

One thing we can do is to use some functions from the +tidyverse package forcats, which can +be very handy for dealing with categorical variables like these cell +types.

+

We will use two of these functions in the chunk below: First we will +use fct_collapse to take some of the finer labels that we +might not be as interested in and collapse them into logical groupings +(in this case, the main label that they were part of). +After that, we will use fct_relevel to put the remaining +factor levels in the order we would like them to appear for +plotting.

+ + + +
collapsed_labels <- singler_result_fine$pruned.labels |>
+  forcats::fct_collapse(
+    "Monocytes" = c(
+        "Classical monocytes", 
+        "Intermediate monocytes",   
+        "Non classical monocytes"),
+    "Dendritic cells" = c(
+        "Myeloid dendritic cells",
+        "Plasmacytoid dendritic cells"),
+    "T cells" = c(
+        "MAIT cells",
+        "Non-Vd2 gd T cells",
+        "Vd2 gd T cells"),
+    "Helper T cells" = c(
+        "Th1 cells",
+        "Th1/Th17 cells", 
+        "Th17 cells", 
+        "Th2 cells",
+        "Follicular helper T cells"),
+    "B cells" = c(
+        "Naive B cells",
+        "Switched memory B cells",
+        "Non-switched memory B cells",
+        "Exhausted B cells",
+        "Plasmablasts"      
+    )
+  ) |>
+  # order for plotting
+  forcats::fct_relevel(
+    "Helper T cells",
+    "T regulatory cells",
+    "Naive CD4 T cells",
+    "Terminal effector CD4 T cells",
+    "Naive CD8 T cells",
+    "Central memory CD8 T cells",
+    "Effector memory CD8 T cells",
+    "Terminal effector CD8 T cells",
+    "T cells",
+    "Natural killer cells",
+    "B cells",
+    "Monocytes",
+    "Dendritic cells",
+    "Progenitor cells",
+    "Low-density basophils"
+  )
+ + + +
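For readers who want to see what collapsing and reordering do without forcats, the same idea can be sketched in base R with a named lookup vector and an explicit factor level order. This is an illustrative alternative rather than what the chunk above runs; fine_labels and group_lookup are hypothetical names with made-up contents.

```r
# hypothetical fine labels, including one pruned (NA) cell
fine_labels <- c(
  "Classical monocytes", "Th1 cells", "Naive B cells",
  NA, "Intermediate monocytes", "Natural killer cells"
)

# named vector mapping fine labels to the group they collapse into
group_lookup <- c(
  "Classical monocytes" = "Monocytes",
  "Intermediate monocytes" = "Monocytes",
  "Th1 cells" = "Helper T cells",
  "Naive B cells" = "B cells"
)

# replace labels found in the lookup; leave everything else unchanged
collapsed <- ifelse(
  fine_labels %in% names(group_lookup),
  group_lookup[fine_labels],
  fine_labels
)

# set the level order explicitly, which controls legend order in plots
collapsed_factor <- factor(
  collapsed,
  levels = c("Helper T cells", "Natural killer cells", "B cells", "Monocytes")
)
levels(collapsed_factor)
```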

Now that we have that set up, we can plot using our collapsed and +ordered cell type labels.

+ + + +
sce$celltype_collapsed <- collapsed_labels
+scater::plotUMAP(sce, 
+                 color_by = "celltype_collapsed")
+ + +

+ + + +
+
+

Heatmap of cell types & clusters

+

Let’s look at how the cell type assignments we obtained using +SingleR compare to the clusters that we found using the +unsupervised clustering at the start of this notebook.

+

To do this, we will again use the table() function, but +now with two vectors as input, to build a contingency table of the cell +types and clusters that each cell was classified with.

+ + + +
# create a table of clusters & cell type counts
+type_cluster_tab <- table(sce$celltype_fine, sce$nn_cluster, useNA = "ifany")
+
+# look at the top corner of the results
+type_cluster_tab[1:5, 1:5]
+ + +
                             
+                                 1    2    3    4    5
+  Central memory CD8 T cells    81    0    0    0    0
+  Classical monocytes            0    0 2195  698    0
+  Effector memory CD8 T cells   26    0    0    0    0
+  Exhausted B cells              0    0    0    0    0
+  Follicular helper T cells     93    0    0    0    0
+ + + +

As you can see, this produced a table with rows for each cell type +and columns for each cluster number. The values are the count of cells +for each cluster/cell type combination. However, these raw counts are +not quite what we’ll want for visualization. Since the total number of +cells differs across clusters, we’d like to convert these counts into +the proportions of each cell type in each cluster.

+

We’ll do this by going through the table column by column and +dividing each value by the sum for that cluster. This will give us +normalized values where the values in each column now sum to 1. To do +that, we will use the apply function, which allows us to +operate on a matrix row by row or column by column, applying a function +to each “slice”. Since the function we want to apply is very short, we +will use R’s new (as of version 4.1) anonymous function shorthand: +\(x) ... can be used to define a function that takes +as input values x (where the ... is where you +would put the expression to calculate). Here we will apply the +expression x/sum(x), which will divide each element of a +vector x by the sum of its values.

+ + + +
# normalize by the number of cells in each cluster (columns)
+type_cluster_tab <- apply(
+  type_cluster_tab, 
+  2, # apply function to columns
+  \(x) x/sum(x) # function to apply
+)
+# print the normalized values
+type_cluster_tab[1:5, 1:5]
+ + +
                             
+                                       1 2         3        4 5
+  Central memory CD8 T cells  0.08617021 0 0.0000000 0.000000 0
+  Classical monocytes         0.00000000 0 0.9825425 0.656015 0
+  Effector memory CD8 T cells 0.02765957 0 0.0000000 0.000000 0
+  Exhausted B cells           0.00000000 0 0.0000000 0.000000 0
+  Follicular helper T cells   0.09893617 0 0.0000000 0.000000 0
+ + + +
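As a quick check of the column normalization, the sketch below applies the same apply() call to a small made-up count matrix (toy_counts is hypothetical) and compares it with the base R function prop.table(), which computes the same column proportions when given margin = 2.

```r
# hypothetical counts: rows are cell types, columns are clusters
toy_counts <- matrix(
  c(8, 2, 0, 5),
  nrow = 2,
  dimnames = list(c("T cells", "B cells"), c("1", "2"))
)

# divide each column by its sum, as in the chunk above
toy_props <- apply(toy_counts, 2, \(x) x / sum(x))

# each column now sums to 1
colSums(toy_props)

# prop.table() with margin = 2 gives the same column proportions
toy_props_alt <- prop.table(toy_counts, margin = 2)
all.equal(toy_props, toy_props_alt)
```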

Now we can plot these results as a heatmap, using the +pheatmap package. There is a lot of customization we could +do here, but pheatmap (pretty heatmap) has good defaults, +so we won’t spend too much time on it for now.

+ + + +
# plot with pheatmap
+pheatmap::pheatmap(type_cluster_tab)
+ + +

+ + + +

We can see that most of our clusters are indeed defined by a single +cell type, though there are some clusters (e.g., 1 & 9) that have a +number of (related) cell types within them. There are also some places +where single cell types are spread across a few different clusters +(Classical monocytes, for example).

+
+
+

Classifying by clusters

+

While most of the time we will want to classify single cells, +sometimes the sparseness of the data may mean that individual cells do +not provide reliable estimates of cell types.

+

An alternative approach is to classify the clusters as a whole, +assuming that the clusters we have identified represent a single cell +state. If that is the case, then we should be able to combine the data +for all cells across each cluster, then apply our cell typing method to +this group of cells. This is similar to an approach we will return to +later in the context of differential expression.

+

The first step here is to create a new matrix where we sum the counts +across cells that were assigned to the same cluster. +Because SingleR is a non-parametric approach, we can +perform this step with the raw counts matrix. There are a few different +ways to do this, but we will use the function +DelayedArray::colsum(), which can work directly on the +sparse matrices that are often found in SCE objects. We will provide it +with the matrix we need, and then a vector of the cluster assignments +for each column of the matrix. The function will then sum expression +values for each gene across all of the columns that share a cluster assignment.

+ + + +
# sum count matrix by cluster
+cluster_mat <- DelayedArray::colsum(counts(sce), sce$nn_cluster)
+# print new dimensions
+dim(cluster_mat)
+ + +
[1] 36601    20
+ + + +

You can see that the resulting matrix still has the same number of +rows we have seen before, but now only has as many columns as the number +of clusters that the cells were assigned to.
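To make the summing step concrete, here is a base R sketch on a tiny dense matrix. It uses rowsum() on the transposed matrix as a stand-in for DelayedArray::colsum(); the matrix toy_mat and the vector toy_clusters are hypothetical.

```r
# hypothetical counts: 3 genes (rows) by 4 cells (columns)
toy_mat <- matrix(
  1:12,
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)
# hypothetical cluster assignment for each cell (column)
toy_clusters <- c("a", "b", "a", "b")

# rowsum() sums rows by group, so transpose, sum, and transpose back
toy_summed <- t(rowsum(t(toy_mat), group = toy_clusters))

# same number of genes, but one column per cluster
dim(toy_summed)
toy_summed
```

In this toy case, the column for cluster "a" is the sum of cell1 and cell3, and the column for cluster "b" is the sum of cell2 and cell4.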

+

Now we can apply the same SingleR model to these +results, using the new matrix as input along with the previously trained +model. As there are only 20 clusters to classify, this will be very +quick, and we don’t need to parallelize it!

+ + + +
# run SingleR classification with previously trained model
+singler_cluster <- SingleR::classifySingleR(
+  cluster_mat, # cluster expression matrix
+  singler_finemodel # pre-trained model
+)
+
+# view results
+head(singler_cluster)
+ + +
DataFrame with 6 rows and 4 columns
+                          scores                 labels delta.next
+                        <matrix>            <character>  <numeric>
+1 0.612108:0.431222:0.615571:...         Th1/Th17 cells 0.01424091
+2 0.310607:0.468907:0.306530:...          Naive B cells 0.55307985
+3 0.275063:0.805908:0.308321:...    Classical monocytes 0.35835004
+4 0.276687:0.776140:0.317954:...    Classical monocytes 0.08904561
+5 0.289710:0.708883:0.317147:... Myeloid dendritic ce.. 0.07394915
+6 0.462245:0.395730:0.451555:...          Naive B cells 0.00785926
+           pruned.labels
+             <character>
+1         Th1/Th17 cells
+2          Naive B cells
+3    Classical monocytes
+4    Classical monocytes
+5 Myeloid dendritic ce..
+6          Naive B cells
+ + + +

The result is a fairly small table of results, but we are most +interested in the labels, which we would like to associate with each +cell in our SCE object for visualization. Since the cluster labels are +the row names of that table, we can perform a cute little trick to +assign labels back to each cell based on the name of the cluster that it +was assigned to. (In this case the cluster names are all numbers, but +that might not always be the case.) We’ll select values repeatedly from +the singler_cluster table, using the cluster assignment to +pick a row, and then always picking the pruned.labels +column.

+ + + +
sce$celltype_cluster <- singler_cluster[sce$nn_cluster, "pruned.labels"]
+ + + +
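The lookup trick can be seen more clearly on a toy table. This sketch uses a plain base R data frame with made-up cluster names (toy_results and toy_cell_clusters are hypothetical). It also wraps the index in as.character(), an extra precaution that is not in the chunk above: in base R, a factor used as an index is treated as its integer codes rather than its labels, so converting to character makes the lookup by row name explicit.

```r
# hypothetical per-cluster results, with cluster names as row names
toy_results <- data.frame(
  pruned.labels = c("T cells", "B cells", "Monocytes"),
  row.names = c("c1", "c2", "c3")
)

# hypothetical cluster assignment for each of four cells
toy_cell_clusters <- factor(c("c2", "c1", "c1", "c3"))

# select one row per cell by cluster name, always taking the same column
toy_cell_labels <- toy_results[as.character(toy_cell_clusters), "pruned.labels"]
toy_cell_labels
```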

Now we can plot these cluster-based cell type assignments using the +now familiar plotUMAP() function.

+ + + +
scater::plotUMAP(sce, color_by = "celltype_cluster")
+ + +

+ + + +

This sure looks nice and clean, but what have we really done here? We +are assuming that each cluster has only a single cell type, +which is a pretty bold assumption, as we really aren’t sure that the +clusters we created were correct. You may recall that clustering +algorithms are quite sensitive to parameter choice, so a different +parameter choice could quite likely give a different result.

+
+
+

MetaCell approaches

+

As a middle ground between the potentially messy single-cell cell +type assignment and the almost-certainly overconfident cluster-based +assignment above, we can take an approach inspired by Baran et al. +(2019) using something they called metacells. The idea is +that we can perform fine-scaled clustering to identify groups of very +similar cells, then sum the counts within those clusters as “metacells” +to use for further analysis. The original paper includes a number of +optimizations to make sure that the metacell clusters have desirable +properties for downstream analysis. We won’t go into that depth here, +but we can apply similar ideas.

+

To begin, we will perform some fine-scale clustering, using a simpler +clustering algorithm: K-means clustering. We will use the same +bluster package, clustering based on the PCA results we +have from earlier, but this algorithm allows us to specify the number of +clusters we want to end up with. We have about 8000 cells, so let’s +cluster those into groups of approximately 80 cells, which works out to +100 clusters. While this is almost certainly more clusters than are +“real” in this dataset, our goal here is not to find differences among +clusters, just to get homogeneous groups of cells.

+ + + +
# perform k-means clustering
+kclusters <- bluster::clusterRows(
+  reducedDim(sce, "PCA"), 
+  bluster::KmeansParam(
+    centers = 100, # the number of clusters 
+    iter.max = 100 # more iterations to be sure of convergence
+  )
+)
+ + + +
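K-means fixes the number of clusters, not their sizes, so the groups will only average out to the target size. As a rough sketch on simulated data, using base R `kmeans()` as a stand-in for the `bluster` call (`toy_pcs` and `target_size` are hypothetical names), we can derive the number of centers from a target group size and then inspect how cluster sizes actually vary.

```r
set.seed(12345)

# toy stand-in for a PCA matrix: 200 "cells" by 5 components
toy_pcs <- matrix(rnorm(200 * 5), ncol = 5)

# choose the number of clusters from a target group size
target_size <- 20
n_centers <- round(nrow(toy_pcs) / target_size)

# base R k-means, with extra iterations as in the chunk above
toy_km <- kmeans(toy_pcs, centers = n_centers, iter.max = 100)

# how many cells landed in each cluster?
cluster_sizes <- table(toy_km$cluster)
summary(as.vector(cluster_sizes))
```

With the real data, `table(kclusters)` should give the equivalent size summary.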

Now we can apply exactly the same approach we used with the 20 +clusters identified by the earlier graph-based clustering.

+ + + +
# create a "metacell" matrix by summing fine-scale clusters
+metacell_mat <- DelayedArray::colsum(counts(sce), kclusters)
+
+# apply SingleR model to metacell matrix
+metacell_singler <- SingleR::classifySingleR(
+  metacell_mat, 
+  singler_finemodel
+)
+
+# apply metacell cell type assignments to individual cells
+sce$celltype_metacell <- metacell_singler[kclusters, "pruned.labels"]
+ + + +
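To make the summing step concrete, here is a small sketch with a made-up count matrix. It uses base R `rowsum()` on the transposed matrix as a stand-in for `DelayedArray::colsum()`; the gene, cell, and metacell names are hypothetical.

```r
# toy count matrix: 3 genes (rows) by 4 cells (columns)
toy_counts <- matrix(
  1:12,
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("c1", "c2", "c3", "c4"))
)

# toy fine-scale cluster assignment for each cell
toy_groups <- c("m1", "m2", "m1", "m2")

# sum the columns within each group: transpose, sum rows by group, transpose back
toy_metacells <- t(rowsum(t(toy_counts), toy_groups))
toy_metacells
```

Each column of the result is one "metacell", and the total count per gene is unchanged by the grouping.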

Now we can plot the results as we have done before.

+ + + +
scater::plotUMAP(sce, color_by = "celltype_metacell")
+ + +
Warning: Removed 208 rows containing missing values or values outside the scale range
+(`geom_point()`).
+ + +

+ + + +

What do you think of this plot? Is this more or less useful than the +original per-cell cell type assignments?
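The plotting warning above reports rows dropped for missing values. One possible source is cells whose pruned label is `NA`, so it can be worth counting those before interpreting the plot. A minimal sketch with made-up labels follows; with the real data, the vector to inspect would presumably be `sce$celltype_metacell`.

```r
# toy per-cell labels, with NA for cells that were not confidently labeled
toy_labels <- c("T cells", NA, "B cells", "T cells", NA, "NK cells")

# number of cells without a label
n_unlabeled <- sum(is.na(toy_labels))
n_unlabeled

# label counts, including the NA category when present
table(toy_labels, useNA = "ifany")
```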

+
+
+
+

Save results

+

To save disk space (and time), we won’t write out the whole SCE +object, as we haven’t changed any of the core data there. Instead we +will just write out the cell information table (colData) as +a TSV file.

+ + + +
colData(sce) |>
+  as.data.frame() |>
+  readr::write_tsv(file = cellinfo_file)
+ + + +
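One detail to keep in mind: `readr::write_tsv()` does not write row names, so if the cell barcodes are stored only as row names of the table, they will not appear in the file. The sketch below shows one way to keep them, using a made-up table and base R `write.table()` and `read.delim()` in place of `readr`; the `barcode` column name is an assumption.

```r
# toy stand-in for the cell information table; row names are cell barcodes
toy_coldata <- data.frame(
  celltype_metacell = c("T cells", "B cells", NA),
  row.names = c("AAAC-1", "AAAG-1", "AACT-1")
)

# copy row names into a regular column so they survive writing
toy_coldata$barcode <- rownames(toy_coldata)

# write and read back a tab-separated file (base R equivalents)
toy_file <- tempfile(fileext = ".tsv")
write.table(toy_coldata, toy_file, sep = "\t", quote = FALSE, row.names = FALSE)
toy_readback <- read.delim(toy_file)
toy_readback
```

If the table already carries a barcode column, this extra step is unnecessary.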
+ + +
LS0tCnRpdGxlOiAiQW5ub3RhdGluZyBjZWxsIHR5cGVzIGZyb20gc2NSTkEtc2VxIGRhdGEiCmF1dGhvcjogRGF0YSBMYWIgZm9yIEFMU0YKZGF0ZTogMjAyMwpvdXRwdXQ6CiAgaHRtbF9ub3RlYm9vazogCiAgICB0b2M6IHRydWUKICAgIHRvY19mbG9hdDogdHJ1ZQotLS0KCiMjIE9iamVjdGl2ZXMKClRoaXMgbm90ZWJvb2sgd2lsbCBkZW1vbnN0cmF0ZSBob3cgdG86CgotIEV4cGxvcmUgZGF0YSBmcm9tIGFudGlib2R5LWRlcml2ZWQgdGFncyAoQ0lURS1zZXEpCi0gQXBwbHkgc2ltcGxlIHJ1bGUtYmFzZWQgY2xhc3NpZmljYXRpb24gdG8gaWRlbnRpZnkgY2VsbCB0eXBlcwotIElkZW50aWZ5IGNlbGwgdHlwZXMgYnkgc2ltaWxhcml0eSB0byByZWZlcmVuY2UgZGF0YXNldHMgd2l0aCBgU2luZ2xlUmAKLSBBcHBseSBgU2luZ2xlUmAgY2xhc3NpZmljYXRpb24gdG8gZ3JvdXBzIG9mIGNlbGxzCgotLS0KCkluIHRoaXMgbm90ZWJvb2ssIHdlIHdpbGwgYXR0ZW1wdCB0byBhbm5vdGF0ZSBjZWxsIHR5cGVzIHRvIGVhY2ggb2YgdGhlIGNlbGxzIGluIGEgZGF0YXNldCwgdXNpbmcgc29tZSBvZiB0aGUgYXV0b21hdGVkIHRvb2xzIHRoYXQgYXJlIGF2YWlsYWJsZSB3aXRoaW4gdGhlIEJpb2NvbmR1Y3RvciB1bml2ZXJzZS4KCk11Y2ggb2YgdGhlIG1hdGVyaWFsIGluIHRoaXMgbm90ZWJvb2sgaXMgZGlyZWN0bHkgaW5zcGlyZWQgYnksIGFuZCBkcmF3cyBoZWF2aWx5IG9uLCBtYXRlcmlhbCBwcmVzZW50ZWQgaW4gdGhlIGJvb2sgW19PcmNoZXN0cmF0aW5nIFNpbmdsZSBDZWxsIEFuYWx5c2lzIHdpdGggQmlvY29uZHVjdG9yX10oaHR0cDovL2Jpb2NvbmR1Y3Rvci5vcmcvYm9va3MvMy4xNi9PU0NBLykuIAoKVGhlIGRhdGEgd2Ugd2lsbCB1c2UgZm9yIHRoaXMgbm90ZWJvb2sgaXMgZGVyaXZlZCBmcm9tIGEgWzEweCBHZW5vbWljcyBkYXRhc2V0IG9mIGh1bWFuIHBlcmlwaGVyYWwgYmxvb2QgbW9ub251Y2xlYXIgY2VsbHMgKFBCTUNzKV0oaHR0cHM6Ly9zb2Z0d2FyZS4xMHhnZW5vbWljcy5jb20vc2luZ2xlLWNlbGwtZ2VuZS1leHByZXNzaW9uL2RhdGFzZXRzLzYuMC4wLzEwa19QQk1Dc19Ub3RhbFNlcV9CXzNwKS4KVGhlc2UgZGF0YSBpbmNsdWRlIGJvdGggc2luZ2xlIGNlbGwgUk5BLXNlcSBjb3VudHMgYW5kIHF1YW50aWZpY2F0aW9uIG9mIGFudGlib2R5LWRlcml2ZWQgdGFncyAoQURUcykgcGVyZm9ybWVkIGJ5IHNlcXVlbmNpbmcgc2hvcnQgRE5BIGJhcmNvZGVzIGF0dGFjaGVkIHRvIHNwZWNpZmljIGFudGlib2RpZXMuIApUaGlzIHR5cGUgb2YgQURUIHNlcXVlbmNpbmcgd2l0aCBzaW5nbGUgY2VsbHMgaXMgY29tbW9ubHkga25vd24gYXMgQ0lURS1zZXEsIGFmdGVyIHRoZSBwcm90b2NvbCBkZXZlbG9wZWQgYnkgW1N0b2Vja2l1cyBfZXQgYWwuXyAoMjAxNyldKGh0dHBzOi8vZG9pLm9yZy8xMC4xMDM4L25tZXRoLjQzODApLiAgClRoZSBhbnRpYm9kaWVzIHVzZWQgaGVyZSBhcmUgdGhlIFtUaGUgVG90YWxTZXHihKItQiBI
dW1hbiBUQk5LIENvY2t0YWlsXShodHRwczovL3d3dy5iaW9sZWdlbmQuY29tL2VuLXVzL3Byb2R1Y3RzL3RvdGFsc2VxLWItaHVtYW4tdGJuay1jb2NrdGFpbC0xOTA0MyksIGEgc2V0IG9mIGFudGlib2RpZXMgZGVzaWduZWQgdG8gcmVhY3Qgd2l0aCBpbW11bmUgY2VsbCBzdXJmYWNlIG1hcmtlcnMuCgohW1NpbmdsZS1jZWxsIHJvYWRtYXA6IENlbGwgdHlwZV0oZGlhZ3JhbXMvcm9hZG1hcF9zaW5nbGVfY2VsbHR5cGUucG5nKQoKVGhlIGRhdGEgaGVyZSBoYXZlIGFscmVhZHkgYmVlbiBmaWx0ZXJlZCwgbm9ybWFsaXplZCwgYW5kIGhhZCBkaW1lbnNpb24gcmVkdWN0aW9ucyBjYWxjdWxhdGVkIGZvciB0aGUgc2luZ2xlLWNlbGwgUk5BLXNlcSBkYXRhLgpUaGUgQURUIGRhdGEgaGFzIGFsc28gYmVlbiBzZXBhcmF0ZWx5IGZpbHRlcmVkIGFuZCBub3JtYWxpemVkLgpGb3IgZGV0YWlscyBhYm91dCBob3cgdG8gcGVyZm9ybSB0aGVzZSB0YXNrcyB3aXRoIGRhdGEgdGhhdCBoYXMgYmVlbiBwcm9jZXNzZWQgd2l0aCBDZWxsIFJhbmdlciwgeW91IG1heSB3YW50IHRvIGxvb2sgYXQgdGhlIFsiSW50ZWdyYXRpbmcgd2l0aCBwcm90ZWluIGFidW5kYW5jZSIgY2hhcHRlcl0oaHR0cDovL2Jpb2NvbmR1Y3Rvci5vcmcvYm9va3MvMy4xNi9PU0NBLmFkdmFuY2VkL2ludGVncmF0aW5nLXdpdGgtcHJvdGVpbi1hYnVuZGFuY2UuaHRtbCNzZXR0aW5nLXVwLXRoZS1kYXRhKSBvZiBPU0NBLgoKVGhlIHByb2Nlc3NlZCBnZW5lIGV4cHJlc3Npb24gYW5kIEFEVCBkYXRhIHdlcmUgc2F2ZWQgaW50byBhIGNvbWJpbmVkIGBTaW5nbGVDZWxsRXhwZXJpbWVudGAgKFNDRSkgb2JqZWN0LCBhbmQgd2Ugd2lsbCBzdGFydCB3aXRoIHRoYXQgb2JqZWN0IGZvciBvdXIgZXhwbG9yYXRpb24gaGVyZS4KCiMjIFNldCB1cAoKVG8gc3RhcnQsIHdlIHdpbGwgbG9hZCBzb21lIG9mIHRoZSBsaWJyYXJpZXMgd2Ugd2lsbCBuZWVkIGxhdGVyLCBhbmQgc2V0IGEgcmFuZG9tIG51bWJlciBzZWVkIGZvciByZXByb2R1Y2liaWxpdHkuCgpgYGB7ciBzZXR1cH0KIyBMb2FkIGxpYnJhcmllcwpsaWJyYXJ5KGdncGxvdDIpICMgcGxvdHRpbmcgZnVuY3Rpb25zCmxpYnJhcnkoU2luZ2xlQ2VsbEV4cGVyaW1lbnQpICMgQmlvY29uZHVjdG9yIHNpbmdsZS1jZWxsIGRhdGEgY2xhc3MKCgojIFNldHRpbmcgdGhlIHNlZWQgZm9yIHJlcHJvZHVjaWJpbGl0eQpzZXQuc2VlZCgxMjM0NSkKYGBgCgoKIyMjIERpcmVjdG9yaWVzIGFuZCBmaWxlcwoKQXMgbWVudGlvbmVkLCBvdXIgaW5wdXQgZmlsZSBoZXJlIGlzIGEgc2luZ2xlIG5vcm1hbGl6ZWQgYW5kIHByb2Nlc3NlZCBTQ0Ugb2JqZWN0LCBzdG9yZWQgYXMgYW4gYHJkc2AgZmlsZS4gClRoYXQgc2hvdWxkIGJlIGFsbCB3ZSBuZWVkIHRvIHJlYWQgaW4hCgpPdXIgb3V0cHV0IHdpbGwgYmUgYSB0YWJsZSBvZiBwZXItY2VsbCBpbmZvcm1hdGlvbiwgd2hpY2ggd2lsbCBpbmNsdWRlIHRoZSBjZWxsIHR5cGUgYXNzaWdubWVudHMgd2Ug
aGF2ZSBtYWRlIHRocm91Z2hvdXQgdGhpcyBub3RlYm9vay4KV2UgYXJlbid0IHBsYW5uaW5nIGFueSBzaWduaWZpY2FudCBtb2RpZmljYXRpb25zIG9mIHRoZSB1bmRlcmx5aW5nIGRhdGEsIHNvIHdlIHdvbid0IGJvdGhlciByZS1zYXZpbmcgdGhlIHdob2xlIFNDRSBvYmplY3QgYXMgYSBuZXcgYC5yZHNgIGZpbGUgdGhpcyB0aW1lLgoKYGBge3IgZmlsZXBhdGhzLCBsaXZlPVRSVUV9CiMgZGlyZWN0b3J5IGZvciB0aGUgaW5wdXQgZGF0YQpkYXRhX2RpciA8LSBmaWxlLnBhdGgoImRhdGEiLCAKICAgICAgICAgICAgICAgICAgICAgICJQQk1DLVRvdGFsU2VxQiIsIAogICAgICAgICAgICAgICAgICAgICAgIm5vcm1hbGl6ZWQiKQoKIyB0aGUgaW5wdXQgZmlsZSBpdHNlbGYKc2NlX2ZpbGUgPC0gZmlsZS5wYXRoKGRhdGFfZGlyLCAKICAgICAgICAgICAgICAgICAgICAgICJQQk1DX1RvdGFsU2VxQl9ub3JtYWxpemVkX3NjZS5yZHMiKQoKIyBBIGRpcmVjdG9yeSB0byBzdG9yZSBvdXRwdXRzCmFuYWx5c2lzX2RpciA8LSBmaWxlLnBhdGgoImFuYWx5c2lzIiwgCiAgICAgICAgICAgICAgICAgICAgICAgICAgIlBCTUMtVG90YWxTZXFCIikKCiMgQ3JlYXRlIGRpcmVjdG9yeSBpZiBpdCBkb2Vzbid0IGV4aXN0CmZzOjpkaXJfY3JlYXRlKGFuYWx5c2lzX2RpcikKCiMgb3V0cHV0IHRhYmxlIHBhdGgKY2VsbGluZm9fZmlsZSA8LSBmaWxlLnBhdGgoYW5hbHlzaXNfZGlyLCAKICAgICAgICAgICAgICAgICAgICAgICAgICAgIlBCTUNfVG90YWxTZXFCX2NlbGxpbmZvLnRzdiIpCmBgYAoKCiMjIEV4cGxvcmluZyBhIENJVEUtc2VxIGBTaW5nbGVDZWxsRXhwZXJpbWVudGAKCk5vdyB0aGF0IHRoZSBwcmVsaW1pbmFyeSBzZXR1cCBpcyBvdXQgb2YgdGhlIHdheSwgd2UgY2FuIGdldCBzdGFydGVkLiAKRmlyc3Qgd2Ugd2lsbCByZWFkIGluIHRoZSBgU2luZ2xlQ2VsbEV4cGVyaW1lbnRgIGZyb20gdGhlIGlucHV0IGZpbGUgd2UgZGVmaW5lZCBlYXJsaWVyLgoKYGBge3IgcmVhZCBTQ0UsIGxpdmU9VFJVRX0KIyByZWFkIGluIHRoZSBTQ0UgZmlsZQpzY2UgPC0gcmVhZHI6OnJlYWRfcmRzKHNjZV9maWxlKQojIHByaW50IGEgc3VtbWFyeSBvZiB0aGUgU0NFCnNjZQpgYGAKClRoaXMgc2hvdWxkIGxvb2sgc2ltaWxhciB0byB0aGUgU0NFIG9iamVjdHMgdGhhdCB3ZSBoYXZlIHNlZW4gYmVmb3JlLCBjb250YWluaW5nIGBjb3VudHNgIGFuZCBgbG9nY291bnRzYCBhc3NheXMgd2hlcmUgZWFjaCBjZWxsIGlzIGEgY29sdW1uIGFuZCBlYWNoIHJvdyBpcyBhIGdlbmUuCldlIGFsc28gaGF2ZSBzb21lIG9mIHRoZSBgcm93RGF0YWAsIGBjb2xEYXRhYCBhbmQgcmVkdWNlZCBkaW1lbnNpb24gbWF0cmljZXMgdGhhdCB3ZSBoYXZlIHNlZW4gYmVmb3JlLgoKQnV0IHdoZXJlIGFyZSB0aGUgZGF0YSBmcm9tIHRoZSBBRFRzPwpXZSB3b3VsZG4ndCBuZWNlc3NhcmlseSB3YW50IHRob3NlIHN0b3JlZCBpbiB0aGUgbWFpbiBkYXRhIG1hdHJpY2VzLCBhcyB0aGUg
Y2hhcmFjdGVyaXN0aWNzIG9mIEFEVCBiYXJjb2RlIGRhdGEgaXMgZ29pbmcgdG8gYmUgcXVpdGUgZGlmZmVyZW50IGZyb20gZ2VuZSBleHByZXNzaW9uIGRhdGEuCgpUbyBrZWVwIHRoZSBBRFQgZGF0YSBzZXBhcmF0ZSBmcm9tIHRoZSBSTkEgZ2VuZSBleHByZXNzaW9uIGRhdGEsIHdlIGhhdmUgc3BsaXQgdGhpcyBkYXRhIG9mZiBpbnRvIGFuIF9hbHRlcm5hdGl2ZSBleHBlcmltZW50XyAoYGFsdEV4cGApIHNsb3QuCllvdSBjYW4gc2VlIHRoZSBuYW1lIG9mIHRoaXMgYGFsdEV4cGAgb24gdGhlIGxpbmUgYGFsdEV4cE5hbWVzYCBhYm92ZS4gCldlIF9jb3VsZF8gaGF2ZSBtb3JlIHRoYW4gb25lIHR5cGUgb2YgYWx0ZXJuYXRpdmUgZXhwZXJpbWVudCAoc3VjaCBhcyBzcGlrZS1pbiBvciBBVEFDLXNlcSksIGJ1dCBpbiB0aGlzIGNhc2UsIGp1c3QgdGhlIG9uZS4KClRvIGFjY2VzcyB0aGUgY29udGVudHMgb2YgdGhlIGBhbHRFeHBgIHNsb3QsIHdlIGNhbiB1c2UgdGhlIGBhbHRFeHAoKWAgZnVuY3Rpb24uCkxldCdzIGxvb2sgYXQgd2hhdCB3ZSBoYXZlIGluIHRoYXQgc2xvdDoKCmBgYHtyIHZpZXcgYWx0RXhwLCBsaXZlPVRSVUV9CiMgcHJpbnQgYSBzdW1tYXJ5IG9mIHRoZSAnQURUJyBhbHRFeHAKYWx0RXhwKHNjZSwgIkFEVCIpCmBgYAoKSXQgaXMgYW5vdGhlciBgU2luZ2xlQ2VsbEV4cGVyaW1lbnRgISAKSW5jZXB0aW9uIQpMZXQncyBsb29rIGF0IHRoYXQgZW1iZWRkZWQgU0NFIG1vcmUgY2xvc2VseS4KClRoZSBmaXJzdCB0aGluZyB0byBub3RlIGlzIHRoYXQgdGhpcyBgYWx0RXhwYCBoYXMgdGhlIHNhbWUgbnVtYmVyIG9mIGNvbHVtbnMgYXMgZGlkIHRoZSBtYWluIFNDRSBvYmplY3QuIApUaG9zZSBjb3JyZXNwb25kZWQgdG8gdGhlIGluZGl2aWR1YWwgY2VsbHMgYmVmb3JlLCBhbmQgc3RpbGwgZG8hCgpUaGVyZSBhcmUgb25seSAxMCByb3dzLCBob3dldmVyLCBhbmQgdGhlc2UgY29ycmVzcG9uZCB0byB0aGUgQURUcyB0aGF0IHdlcmUgYXNzYXllZCBieSB0aGlzIHBhcnRpY3VsYXIgZXhwZXJpbWVudC4gCkp1c3QgYXMgd2UgZGlkIHdpdGggdGhlIGZ1bGwgU0NFLCB3ZSBjYW4gdXNlIGByb3dEYXRhKClgIHRvIHZpZXcgdGhlIHRhYmxlIGNvbnRhaW5pbmcgbWV0YWRhdGEgYXNzb2NpYXRlZCB3aXRoIGVhY2ggb2YgdGhlc2Ugcm93cy4KV2UnbGwgYWRkIHRoZSBgYWx0RXhwKClgIGZ1bmN0aW9uIHRvIHBvaW50IGl0IHRvIHRoZSBlbWJlZGRlZCBvYmplY3Qgd2UgYXJlIGludGVyZXN0ZWQgaW4uIApTaW5jZSB0aGVyZSBpcyBvbmx5IG9uZSBgYWx0RXhwYCwgd2UgZG9uJ3QgbmVlZCB0aGUgc2Vjb25kIChuYW1lKSBhcmd1bWVudCAoYCJBRFQiYCkgdGhhdCB3ZSB1c2VkIGFib3ZlOyB0aGUgZGVmYXVsdCBiZWhhdmlvciBvZiBgYWx0RXhwKClgIGlzIHRvIGp1c3QgZ2l2ZSB1cyB0aGUgZmlyc3QgYGFsdEV4cGAsIGFuZCB0aGF0IGlzIHRoZSBvbmUgKGFuZCBvbmx5KSB0aGF0IHdlIG5lZWQuCgpgYGB7ciBhZHQgcm93cywg
bGl2ZT1UUlVFfQojIFdoYXQgcHJvdGVpbnMgd2VyZSBhc3NheWVkPwpyb3dEYXRhKGFsdEV4cChzY2UpKQpgYGAKCllvdSBjYW4gc2VlIGhlcmUgdGhlIG5hbWVzIGFuZCBzeW1ib2xzIG9mIHRoZSB0YWdzIHVzZWQsIGFsb25nIHdpdGggdGhlIGRlc2lnbmF0aW9uIHRoYXQgYWxsIGhhdmUgYW4gIkFudGlib2R5IENhcHR1cmUiIHR5cGUgKGFzIG9wcG9zZWQgdG8gIkdlbmUgRXhwcmVzc2lvbiIgZm9yIHRoZSBSTkEgZGF0YSkuCk9uZSB5b3UgbWlnaHQgbm90ZSBsb29rcyBkaWZmZXJlbnQgaXMgdGhlIGBJZ0cxYCBjb250cm9sLCB3aGljaCBpcyBhY3R1YWxseSBhIG1vdXNlIGFudGlib2R5IHVzZWQgYXMgYSBuZWdhdGl2ZSBjb250cm9sLiAKCgojIyMgQ2x1c3RlcmluZyByZWR1eAoKV2hpbGUgZGltZW5zaW9uIHJlZHVjdGlvbiB3YXMgcGVyZm9ybWVkIG9uIHRoaXMgZGF0YSwgd2UgaGF2ZSBub3QgeWV0IHBlcmZvcm1lZCBhbnkgY2x1c3RlcmluZy4KCkxldCdzIGFzc2lnbiBzb21lIGNsdXN0ZXJzIHRvIG91ciBjZWxscywgdXNpbmcgZ3JhcGgtYmFzZWQgY2x1c3RlcmluZyBhbmQgZGVmYXVsdCBwYXJhbWV0ZXJzLCB0YWtpbmcgYXMgaW5wdXQgdGhlIFBDQSBtYXRyaXggdGhhdCB3YXMgcHJldmlvdXNseSBjYWxjdWxhdGVkLgpOb3RlIHRoYXQgdGhpcyBQQ0EgbWF0cml4IGFuZCB0aGUgVU1BUCBidWlsdCBmcm9tIGl0IHdlcmUgZGVyaXZlZCBmcm9tIHRoZSBnZW5lIGV4cHJlc3Npb24gZGF0YSwgc28gdGhlIGNsdXN0ZXJpbmcgaXMgZ29pbmcgdG8gcmVmbGVjdCB0aGUgZ2VuZSBleHByZXNzaW9uIGRhdGEgb25seS4KV2hpbGUgd2UgaGF2ZSB0aGUgQURUIGRhdGEsIGl0IGlzIF9ub3RfIGJlaW5nIHVzZWQgZm9yIHRoaXMgc3RhZ2Ugb2YgdGhlIGFuYWx5c2lzLgoKYGBge3IgY2x1c3RlciBjZWxscywgbGl2ZT1UUlVFfQojIHBlcmZvcm0gY2x1c3RlcmluZwpubl9jbHVzdGVycyA8LSBibHVzdGVyOjpjbHVzdGVyUm93cygKICAjIFBDQSBpbnB1dAogIHJlZHVjZWREaW0oc2NlLCAiUENBIiksIAogICMgZ3JhcGggY2x1c3RlcmluZyAmIHBhcmFtZXRlcnMKICBibHVzdGVyOjpOTkdyYXBoUGFyYW0oKQopCgojIGFkZCBjbHVzdGVycyB0byBjb2xEYXRhCnNjZSRubl9jbHVzdGVyIDwtIG5uX2NsdXN0ZXJzCmBgYAoKTm93IHdlIGNhbiBwbG90IHRoZSBjbHVzdGVycyB3ZSBoYXZlIGlkZW50aWZpZWQgd2l0aCBgc2NhdGVyOjpwbG90VU1BUCgpYC4gClRoaXMgaXMgYSBzaG9ydGN1dCBmb3IgYHNjYXRlcjo6cGxvdFJlZHVjZWREaW0oZGltcmVkID0gIlVNQVAiLCAuLi4pYCwgd2hpY2ggY2FuIHNhdmUgdXMgYSBsb3Qgb2YgdHlwaW5nIGFzIHdlIGRvIHRoaXMgcmVwZWF0ZWRseSEKCmBgYHtyIHBsb3QgY2x1c3RlcnN9CiMgcGxvdCBjbHVzdGVycwpzY2F0ZXI6OnBsb3RVTUFQKHNjZSwgY29sb3JfYnkgPSAibm5fY2x1c3RlciIpICsgCiAgIyByZW5hbWUgdGhlIGxlZ2VuZAogIGd1aWRlcyhjb2xvciA9IGd1aWRlX2xlZ2VuZCh0aXRs
ZSA9ICJDbHVzdGVyIikpCmBgYApCdXQgd2hhdCBhcmUgdGhlc2UgY2x1c3RlcnMsIHJlYWxseT8gCkRvIHRoZXkgY29ycmVzcG9uZCB0byBwYXJ0aWN1bGFyIGNlbGwgdHlwZXMgdGhhdCB3ZSBhcmUgaW50ZXJlc3RlZCBpbj8KCkRvZXMgaXQgYm90aGVyIHlvdSB0aGF0IHdlIGp1c3QgdXNlZCB0aGUgZGVmYXVsdCBuZWFyZXN0LW5laWdoYm9yIGdyYXBoIGNsdXN0ZXJpbmcgcGFyYW1ldGVycz8KRG8geW91IGtub3cgd2hhdCB0aG9zZSB3ZXJlPwoKIyMgSW52ZXN0aWdhdGluZyBjZWxsIHR5cGVzCgojIyMgVXNpbmcgQURUIGRhdGEKClRoZSBmaXJzdCB3YXkgd2Ugd2lsbCBpZGVudGlmeSBjZWxsIHR5cGVzIG9mIGluZGl2aWR1YWwgY2VsbHMgaXMgdG8gdXNlIHRoZSBBRFQgbm9ybWFsaXplZCBjb3VudHMuClRoZXNlIGFudGlib2R5IG1hcmtlcnMgd2VyZSAoaG9wZWZ1bGx5KSBjaG9zZW4gZm9yIHRoZWlyIHJlbGV2YW5jZSB0byB0aGUgc2VxdWVuY2VkIGNlbGwgcG9wdWxhdGlvbi4KClRoZSBmaXJzdCBtYXJrZXIgd2Ugd2lsbCBsb29rIGF0IGlzIGBDRDNgLCB3aGljaCBpcyBhIHByb3RlaW4gY29tcGxleCB0aGF0IGlzIGZvdW5kIG9uIHRoZSBzdXJmYWNlIG9mIFQgY2VsbHMuCldlIGNhbiBhZ2FpbiB1c2UgdGhlIGBwbG90VU1BUCgpYCBmdW5jdGlvbiB0byBjb2xvciBjZWxscyBieSBgQ0QzYCBBRFQgbGV2ZWxzLiAKCk5vdGUgdGhhdCB0aGlzIGZ1bmN0aW9uIGNhbiBwbG90IGRhdGEgZnJvbSB0aGUgYGNvbERhdGFgIHRhYmxlIChhcyB3ZSB1c2VkIGl0IGFib3ZlIHdoZW4gcGxvdHRpbmcgY2x1c3RlcnMpLCBpbiB0aGUgbWFpbiBnZW5lIGV4cHJlc3Npb24gbWF0cml4IChhcyB3ZSB1c2VkIGl0IGluIHRoZSBwcmV2aW91cyBub3RlYm9vayksICpBTkQqIGluIGBhbHRFeHBgIHRhYmxlcyBhbmQgbWF0cmljZXMhClNvIHRvIGNvbG9yIGJ5IHRoZSBBRFQgbGV2ZWxzIChhcyBub3JtYWxpemVkIGluIHRoZSBgbG9nY291bnRzYCBtYXRyaXgpIHdlIG9ubHkgbmVlZCB0byBwcm92aWRlIHRoZSB0YWcgbmFtZSB0aGF0IHdlIHdhbnQgdG8gcGxvdCBpbiB0aGUgYGNvbG9yX2J5YCBhcmd1bWVudC4KCmBgYHtyIHBsb3QgQ0QzLCBsaXZlPVRSVUV9CiMgcGxvdCBDRDMgZXhwcmVzc2lvbgpzY2F0ZXI6OnBsb3RVTUFQKHNjZSwgY29sb3JfYnkgPSAiQ0QzIikKYGBgCgpJdCBhcHBlYXJzIHRoYXQgd2UgaGF2ZSBhIG51bWJlciBvZiBwb3RlbnRpYWwgVCBjZWxscyBkb3duIGluIHRoZSBsb3dlciBsZWZ0IQoKTGV0J3MgbG9vayBhdCBhIGNvdXBsZSBvZiBvdGhlciBtYXJrZXJzIHRvIHRyeSB0byBicmVhayB0aG9zZSB1cCBtb3JlIHNwZWNpZmljYWxseS4KClR3byBvdGhlciBtYXJrZXJzIG9mIHJlbGV2YW5jZSB0byB0aGUgVCBjZWxscyBhcmUgYENENGAgYW5kIGBDRDhgLgpUaGUgYENENGAgY29tcGxleCBpcyBwcmVzZW50IGluIGhlbHBlciBUIGNlbGxzIChoZW5jZSB0aGVpciBvdGhlciBjb21tb24gbmFtZSwgQ0Q0KyBUIGNlbGxzKS4K
QnkgY29udHJhc3QsIHRoZSBgQ0Q4YCBjb21wbGV4IGlzIGZvdW5kIG9uIGtpbGxlciBUIGNlbGxzIChDRDgrIGNlbGxzKS4KCkxldCdzIHBsb3QgdGhlIEFEVCByZXN1bHRzIGZvciB0aG9zZSB0d28gbWFya2VycyBhcyB3ZWxsIGJlbG93OgoKYGBge3IgcGxvdCBDRDQsIGxpdmU9VFJVRX0KIyBwbG90IENENCBtYXJrZXIKc2NhdGVyOjpwbG90VU1BUChzY2UsIAogICAgICAgICAgICAgICAgIGNvbG9yX2J5ID0gIkNENCIpCmBgYAoKYGBge3IgcGxvdCBDRDgsIGxpdmU9VFJVRX0KIyBwbG90IENEOCBtYXJrZXIKc2NhdGVyOjpwbG90VU1BUChzY2UsIAogICAgICAgICAgICAgICAgIGNvbG9yX2J5ID0gIkNEOCIpCmBgYAoKCiMjIyBSdWxlLWJhc2VkIGNsYXNzaWZpY2F0aW9uCgpQbG90dGluZyB0aGUgbGV2ZWxzIG9mIHRoZSBBRFRzIHByb3ZpZGVzIGEgbmljZSB2aXN1YWwgcmVwcmVzZW50YXRpb24sIGJ1dCB3aGF0IHdlIHJlYWxseSB3YW50IHRvIGRvIGlzIHRvIHR1cm4gdGhlc2UgdmFsdWVzIGludG8gc3BlY2lmaWMgY2VsbC10eXBlIGFzc2lnbm1lbnRzIGZvciBlYWNoIGNlbGwuClN1Y2ggY2xhc3NpZmljYXRpb24gY291bGQgYmUgY29uc2lkZXJlZCBhcyBhbmFsb2dvdXMgdG8gYSBjZWxsLXNvcnRlciBhc3NheSwgd2hlcmUgd2Ugd291bGQgc2V0IHVwIHNvbWUgcnVsZXMgdG8gbG9vayBhdCBhIGZldyBtYXJrZXJzIGZvciBlYWNoIGNlbGwgYW5kIHVzZSB0aG9zZSB0byBhc3NpZ24gYSBjZWxsIHR5cGUuClRoZSBzaW1wbGVzdCB0eXBlIG9mIHJ1bGUgbWlnaHQgYmUgb25lIHdoZXJlIHdlIHVzZSBhIHRocmVzaG9sZCB0byBjYWxsIGEgbWFya2VyIGFzIHByZXNlbnQgb3IgYWJzZW50LCBhbmQgdGhlbiB1c2UgdGhlIHByZXNlbmNlIG9mIGEgbWFya2VyIHRvIGluZGljYXRlIGEgc3BlY2lmaWMgY2VsbCB0eXBlLgoKVG8gZG8gdGhpcywgd2Ugd2lsbCBuZWVkIHRvIG1ha2Ugc29tZSBkZWNpc2lvbnMsIHN1Y2ggYXMgdGhlIHRocmVzaG9sZHMgd2Ugc2hvdWxkIHVzZSB0byBkZXRlcm1pbmUgd2hldGhlciBhIGNlbGwgaXMgb3IgaXMgbm90IGV4cHJlc3NpbmcgYSBwYXJ0aWN1bGFyIG1hcmtlci4gCkluIGdlbmVyYWwsIG1hcmtlcnMgdGhhdCBhcmUgdXNlZnVsIGZvciB0aGlzIGNlbGwtdHlwaW5nIGFwcHJvYWNoIHdpbGwgaGF2ZSBhIGJpbW9kYWwgZGlzdHJpYnV0aW9uIG9mIGV4cHJlc3Npb24gbGV2ZWxzIHdoaWNoIGNhbiBiZSB1c2VkIHRvIHNlcGFyYXRlIHRoZSBwb3B1bGF0aW9uIGludG8gdHdvIGdyb3VwcyBvZiBjZWxscy4KT25lIGdyb3VwIG9mIGNlbGxzIHdpbGwgaGF2ZSBvbmx5IGEgYmFja2dyb3VuZCBsZXZlbCBzaWduYWwgZm9yIGVhY2ggbWFya2VyIChkdWUgdG8gbm9uLXNwZWNpZmljIGJpbmRpbmcgb3Igb3RoZXIgZmFjdG9ycyksIHdoaWxlIHRoZSBvdGhlciBncm91cCwgdGhvc2UgdGhhdCBleHByZXNzIHRoZSBwcm90ZWluLCB3aWxsIGhhdmUgYSBtdWNoIGhpZ2hlciBsZXZlbCBvZiBiaW5kaW5nIGFuZCBo
aWdoZXIgY291bnRzLgoKVG8gYXNzZXNzIHdoZXRoZXIgdGhlIEFEVHMgd2UgaGF2ZSBjaG9zZW4gaGF2ZSBhIHVzZWZ1bCBkaXN0cmlidXRpb24gb2YgZXhwcmVzc2lvbiB2YWx1ZXMsIGFuZCB0byBpZGVudGlmeSB0aHJlc2hvbGRzIHdlIG1pZ2h0IHVzZSwgd2Ugd291bGQgbGlrZSB0byBwbG90IGVhY2ggQURUIHRhZy4KVG8gZG8gdGhpcywgd2Ugd2lsbCBwdWxsIG91dCB0aGUgZXhwcmVzc2lvbiB2YWx1ZXMgZm9yIHRoZXNlIG1hcmtlcnMgZnJvbSB0aGUgU0NFIG9iamVjdCBhbmQgZG8gc29tZSBkYXRhIHdyYW5nbGluZy4gCgpXZSBhcmUgaW50ZXJlc3RlZCBpbiB0aGUgbm9ybWFsaXplZCBjb3VudHMgZm9yIHRoZSBBRFQgdGFncywgd2hpY2ggYXJlIHN0b3JlZCBpbiB0aGUgYGxvZ2NvdW50c2AgYXNzYXkgb2YgdGhlIGBhbHRFeHBgLgpJZiB5b3UgcmVjYWxsLCB0aGlzIG1hdHJpeCBpcyBzdG9yZWQgd2l0aCB0aGUgY29sdW1ucyBhcyBjZWxscyBhbmQgcm93cyBhcyBtYXJrZXJzLCBidXQgd2UgcmVhbGx5IHdhbnQgaXQgd2l0aCBlYWNoIHJvdyBhIGNlbGwgYW5kIGVhY2ggY29sdW1uIGEgbWFya2VyLiAKU28gd2Ugd2lsbCBmaXJzdCB0cmFuc3Bvc2UgdGhlIGRhdGEsIHRoZW4gY29udmVydCBpdCB0byBhIGRhdGEgZnJhbWUgZm9yIG91ciBuZXh0IHN0ZXBzLgpCZWNhdXNlIHRoZSBTQ0Ugb2JqZWN0IHN0b3JlcyB0aGUgYXNzYXkgZGF0YSBtYXRyaWNlcyBpbiBhIHNwZWNpYWxpemVkIGZvcm1hdCwgd2UgaGF2ZSB0byBkbyBvbmUgZXh0cmEgc3RlcCBjb252ZXJ0IGl0IGZpcnN0IHRvIGEgInJlZ3VsYXIiIFIgbWF0cml4IG9yIFIgd29uJ3Qga25vdyBob3cgdG8gY29udmVydCBpdCB0byBhIGRhdGEgZnJhbWUuCgpgYGB7ciBleHRyYWN0IEFEVH0KIyBjb252ZXJ0IGxvZ2NvdW50cyBkYXRhIHRvIGEgZGF0YSBmcmFtZQphZHRfZGYgPC0gbG9nY291bnRzKGFsdEV4cChzY2UpKSB8PgogIHQoKSB8PiAjIHRyYW5zcG9zZQogIGFzLm1hdHJpeCgpIHw+ICMgY29udmVydCB0byBtYXRyaXgKICBhcy5kYXRhLmZyYW1lKCkgIyBjb252ZXJ0IHRvIGRhdGEgZnJhbWUKCiMgdmlldyB0aGUgZGF0YSBmcmFtZQpoZWFkKGFkdF9kZikKYGBgCgpJZiB3ZSBqdXN0IHdhbnRlZCB0byBwbG90IG9uZSBvZiB0aGVzZSB0YWdzLCB3ZSBjb3VsZCBkbyBzbyByaWdodCBhd2F5LCBidXQgd2l0aCBhIGJpdCBtb3JlIGRhdGEgd3JhbmdsaW5nLCB3ZSBjYW4gY29udmVydCB0aGVzZSByZXN1bHRzIGludG8gYSAidGlkaWVyIiBmb3JtYXQsIHRoYXQgd2lsbCBhbGxvdyB1cyB0byB0YWtlIGZ1bGwgYWR2YW50YWdlIG9mIGB0aWR5dmVyc2VgIHRvb2xzIQpJbiBwYXJ0aWN1bGFyLCBpdCB3aWxsIGxldCB1cyBwbG90IHRoZW0gYWxsIGF0IG9uY2Ugd2l0aCBgZ2dwbG90MmAgZmFjZXRpbmcuCgpSaWdodCBub3cgdGhlIGRhdGEgaXMgaW4gYSAid2lkZSIgZm9ybWF0LCBzdWNoIHRoYXQgZWFjaCBjb2x1bW4gaXMgYSBkaWZmZXJlbnQgdGFnLiAKQnV0IHRoZSBk
YXRhIGluIGFsbCBvZiB0aGUgY29sdW1ucyBpcyB0aGUgc2FtZSB0eXBlLCBhbmQgbWVhc3VyZXMgc29tZXRoaW5nIHNpbWlsYXI6IHRoZSBub3JtYWxpemVkIGNvdW50IG9mIGFuIEFEVC4KT25lIGNvdWxkIGV2ZW4gYXJndWUgdGhhdCBlYWNoIHJvdyBjb250YWlucyAxMCBkaWZmZXJlbnQgb2JzZXJ2YXRpb25zLCB3aGVyZSB0aGUgInRpZHkiIGRhdGEgaWRlYWwsIGFzIGVzcG91c2VkIGJ5IFtXaWNraGFtICgyMDE0KV0oaHR0cHM6Ly9kb2kub3JnLzEwLjE4NjM3L2pzcy52MDU5LmkxMCksIHJlcXVpcmVzIGEgc2luZ2xlIG9ic2VydmF0aW9uIHBlciByb3csIGEgImxvbmciIGZvcm1hdC4KVGhpcyBsb25nIGZvcm1hdCB3aWxsIGhhdmUgb25lIGNvbHVtbiB0aGF0IHRlbGxzIHVzIHdoaWNoIEFEVCB3YXMgbWVhc3VyZWQgYW5kIGEgc2Vjb25kIGNvbHVtbiB3aXRoIHRoZSBtZWFzdXJlbWVudCB2YWx1ZSBpdHNlbGYuCgpXZSBjYW4gcGVyZm9ybSB0aGlzIGNvbnZlcnNpb24gdXNpbmcgdGhlIFtgdGlkeXI6OnBpdm90X2xvbmdlcigpYF0oaHR0cHM6Ly90aWR5ci50aWR5dmVyc2Uub3JnL2FydGljbGVzL3Bpdm90Lmh0bWwpIGZ1bmN0aW9uLAp3aGljaCBhbGxvd3MgdXMgdG8gY29udmVydCBvdXIgZGF0YSBmcmFtZSB3aXRoIG9uZSBjb2x1bW4gcGVyIHRhZyBpbnRvIGEgZGF0YSBmcmFtZSB3aXRoIHNlcGFyYXRlIGNvbHVtbnMgZm9yIHRoZSB0YWcgaWQgKGBBRFRgKSBhbmQgdGhlIGV4cHJlc3Npb24gdmFsdWUgKGBsb2djb3VudGApLgpGb2xsb3dpbmcgY29udmVyc2lvbiwgd2Ugd2lsbCBmaWx0ZXIgdG8ganVzdCB0aGUgQURUcyB0aGF0IHdlIGNhcmUgYWJvdXQuCgpgYGB7ciBwaXZvdCBsb25nZXJ9CmFkdF9kZl9sb25nIDwtIGFkdF9kZiB8PgogICMgcGl2b3QgdG8gbG9uZyBmb3JtYXQKICB0aWR5cjo6cGl2b3RfbG9uZ2VyKAogICAgZXZlcnl0aGluZygpLCAjIHVzZSBhbGwgY29sdW1ucwogICAgbmFtZXNfdG8gPSAiQURUIiwgIyBjb252ZXJ0IHJvdyBuYW1lcyB0byBhIGNvbHVtbiBjYWxsZWQgIkFEVCIKICAgIHZhbHVlc190byA9ICJsb2djb3VudCIgIyBuYW1lIHRoZSB2YWx1ZSBjb2x1bW4gImxvZ2NvdW50IgogICkgfD4KICAjIGZpbHRlciB0byB0YWdzIHdlIGFyZSBpbnRlcmVzdGVkIGluCiAgZHBseXI6OmZpbHRlcihBRFQgJWluJSBjKCJDRDMiLCAiQ0Q0IiwgIkNEOCIpKQoKIyBsb29rIGF0IHRoZSByZXN1bHRpbmcgZGYKaGVhZChhZHRfZGZfbG9uZykKYGBgCgpOb3cgd2UgY2FuIG1ha2UgYSBkZW5zaXR5IHBsb3Qgd2l0aCBgZ2dwbG90MmAgZm9yIGFsbCB0aHJlZSBBRFRzIHdlIGFyZSBpbnRlcmVzdGVkIGluIGF0IG9uY2UuCgpgYGB7ciBwbG90IEFEVHMsIGxpdmU9VFJVRX0KIyBwbG90IGxvZ2NvdW50cyBieSBBRFQKZ2dwbG90KGFkdF9kZl9sb25nLCBhZXMoeCA9IGxvZ2NvdW50LCBmaWxsID0gQURUKSkgKyAKICBnZW9tX2RlbnNpdHkoKSArICMgZGVuc2l0eSBwbG90CiAgZmFjZXRfZ3JpZChyb3dzID0gdmFy
cyhBRFQpKSArICMgZmFjZXQgYnkgQURUCiAgdGhlbWVfYncoKSArICMgbmljZXIgdGhlbWUKICB0aGVtZShsZWdlbmQucG9zaXRpb24gPSAibm9uZSIpICMgbm8gbGVnZW5kIG5lZWRlZApgYGAKClRoZXNlIGxvb2sgcHJldHR5IGdvb2QhCkVhY2ggb2YgdGhlc2UgbWFya2VycyBoYXMgYSBiaW1vZGFsIGRpc3RyaWJ1dGlvbjogQSBsb3dlciBwZWFrIGNvbnNpc3Rpbmcgb2YgY2VsbHMgdGhhdCBkbyBub3QgZXhwcmVzcyB0aGUgcHJvdGVpbiBidXQgd2hpY2ggc3RpbGwgaGF2ZSBhIGJhY2tncm91bmQgbGV2ZWwgb2YgYW50aWJvZHkgYmluZGluZywgYW5kIGFuIHVwcGVyIHBlYWsgb2YgY2VsbHMgdGhhdCBkbyBleHByZXNzIHRoZSBwcm90ZWluIG9mIGludGVyZXN0LgpUaGUgYmFja2dyb3VuZCBsZXZlbCBkb2VzIHZhcnkgYnkgYW50aWJvZHkgbWFya2VyLCBzbyB3ZSB3aWxsIG5lZWQgYSBkaWZmZXJlbnQgdGhyZXNob2xkIHZhbHVlIGZvciBlYWNoIG9uZS4KCldlIGNhbiBub3cgdXNlIHRoZSB2YWx1ZXMgZnJvbSB0aGVzZSBwbG90cyB0byBjb25zdHJ1Y3QgYSBzZXQgb2YgcnVsZXMgdG8gY2xhc3NpZnkgdGhlIFQgY2VsbHMuIApXZSB3aWxsIGRvIHRoaXMgdXNpbmcgdGhlICJ3aWRlIiBkYXRhIGZyYW1lIGZyb20gZWFybGllci4gCgpUaGUgdGhyZXNob2xkcyB3ZSBhcmUgdXNpbmcgaGVyZSB3ZXJlIGlkZW50aWZpZWQganVzdCAiYnkgZXllIiwgc28gdGhpcyBpcyBub3QgYSBwYXJ0aWN1bGFybHkgcHJpbmNpcGxlZCBtZXRob2Qgb2YgY2VsbCB0eXBlIGFzc2lnbm1lbnQsIGJ1dCBpdCBjYW4gYmUgZmFpcmx5IGVmZmVjdGl2ZS4KSGVyZSB3ZSBhcmUgYXNzaWduaW5nIG9ubHkgdGhyZWUgY2VsbCB0eXBlczsgY2VsbHMgdGhhdCBkbyBub3QgZml0IGFueSBvZiB0aGVzZSBjcml0ZXJpYSB3aWxsIGJlIHNldCBhcyBgTkFgLgoKYGBge3IgdGhyZXNob2xkIGNlbGx0eXBlc30KIyBhZGQgY2VsbCB0eXBlIGNvbHVtbiBieSB0aHJlc2hvbGRpbmcKYWR0X2RmIDwtIGFkdF9kZiB8PgogIGRwbHlyOjptdXRhdGUoCiAgICBjZWxsdHlwZSA9IGRwbHlyOjpjYXNlX3doZW4oCiAgICAgIENEMyA+IDYuNyAmIENENCA+IDggfiAiQ0Q0KyBULWNlbGwiLAogICAgICBDRDMgPiA2LjcgJiBDRDggPiA2IH4gIkNEOCsgVC1jZWxsIiwKICAgICAgQ0QzID4gNi43IH4gIlQtY2VsbCIKICAgICkKICApCgphZHRfZGYKYGBgCgpOb3cgd2Ugd2lsbCB3YW50IHRvIGFkZCB0aGUgY2VsbCB0eXBlcyB3ZSBoYXZlIGFzc2lnbmVkIGJhY2sgdG8gb3VyIG9yaWdpbmFsIFNDRSBvYmplY3QuCldlIGNhbiBkbyB0aGF0IGJ5IGRlZmluaW5nIGEgbmV3IGNvbHVtbiBuYW1lLCBgdGhyZXNob2xkX2NlbGx0eXBlYCB0aGF0IHdpbGwgYmUgYWRkZWQgdG8gdGhlIGBjb2xEYXRhYCBvYmplY3QuCkNyZWF0aW5nIGFuZCBhc3NpZ25pbmcgdmFsdWVzIHRvIHRoaXMgY29sdW1uIGNhbiBiZSBkb25lIHdpdGggdGhlIGAkYCBzaG9ydGN1dCwgYW5kIHRoZW4gd2UgY2FuIHBsb3Qg
b3VyIHJlc3VsdHMgd2l0aCB0aGUgYHBsb3RVTUFQKClgIGZ1bmN0aW9uIGFzIGJlZm9yZS4KCmBgYHtyIHBsb3QgdGhyZXNob2xkc30Kc2NlJHRocmVzaG9sZF9jZWxsdHlwZSA8LSBhZHRfZGYkY2VsbHR5cGUKc2NhdGVyOjpwbG90VU1BUChzY2UsIAogICAgICAgICAgICAgICAgIGNvbG9yX2J5ID0gInRocmVzaG9sZF9jZWxsdHlwZSIpICsgCiAgZ3VpZGVzKGNvbG9yID0gZ3VpZGVfbGVnZW5kKHRpdGxlID0gIkNlbGwgdHlwZSIpKQpgYGAKCkhvdyBkaWQgd2UgZG8/IAoKTm90ZSB0aGF0IHdoaWxlIHdlIGFwcGxpZWQgdGhpcyB0ZWNobmlxdWUgdG8gYXNzaWduIGNlbGwgdHlwZXMgdXNpbmcgdGhlIEFEVCBkYXRhLCB3ZSBjb3VsZCB1c2UgdGhlIHNhbWUgdHlwZSBvZiBwcm9jZWR1cmUgdXNpbmcgZ2VuZSBleHByZXNzaW9uIGRhdGEgYWxvbmUsIG9yIGEgY29tYmluYXRpb24gb2YgZ2VuZSBleHByZXNzaW9uIGRhdGEgYW5kIHRhZyBkYXRhLgoKSG93ZXZlciwgd2hhdCB3ZSBkaWQgaGVyZSB3YXMgdmVyeSBhZC1ob2MgYW5kIHF1aXRlIG1hbnVhbCEKV2UgZGlkbid0IGNhbGN1bGF0ZSBhbnkgc3RhdGlzdGljcywgYW5kIHdlIGhhZCB0byBsb29rIGF0IGV2ZXJ5IHRhZyB3ZSB3ZXJlIGludGVyZXN0ZWQgaW4gdG8gcGljayB0aHJlc2hvbGRzLgpBIGRpZmZlcmVudCBkYXRhc2V0IG1pZ2h0IGhhdmUgZGlmZmVyZW50IGJhY2tncm91bmQgbGV2ZWxzLCB3aGljaCB3b3VsZCByZXF1aXJlIGRpZmZlcmVudCB0aHJlc2hvbGRzLiAKCldoaWxlIHRoaXMgdGVjaG5pcXVlIG1pZ2h0IGJlIGdvb2QgZm9yIHNvbWUgc2ltcGxlIGV4cGVyaW1lbnRzLCBhbmQgY2FuIGJlIHVzZWZ1bCBmb3IgbWFudWFsIGN1cmF0aW9uLCBpdCBtaWdodCBub3QgdHJhbnNsYXRlIHdlbGwgdG8gbW9yZSBjb21wbGV4IGRhdGFzZXRzIHdpdGggbXVsdGlwbGUgc2FtcGxlcy4KV2UgYWxzbyBsb29rZWQgYXQgZWFjaCBtYXJrZXIgc2VwYXJhdGVseSwgd2hpY2ggbWlnaHQgbm90IGJlIHRoZSBtb3N0IGVmZmljaWVudCBvciByb2J1c3QgbWV0aG9kIG9mIGFuYWx5c2lzLgoKRm9yIGEgbW9yZSBwcmluY2lwbGVkIGFwcHJvYWNoIHRoYXQgYWxsb3dzIGlkZW50aWZpY2F0aW9uIG9mIGNlbGwgdHlwZXMgYnkgbG9va2luZyBhdCB0aGUgZXhwcmVzc2lvbiBvZiBzZXRzIG9mIGdlbmVzIHRoYXQgYXJlIGtub3duIHRvIGNoYXJhY3Rlcml6ZSBlYWNoIGNlbGwgdHlwZSwgeW91IG1pZ2h0IGxvb2sgYXQgdGhlIFtgQVVDZWxsYCBwYWNrYWdlXShodHRwczovL2Jpb2NvbmR1Y3Rvci5vcmcvcGFja2FnZXMvMy4xNi9iaW9jL2h0bWwvQVVDZWxsLmh0bWwpLgpGb3IgbW9yZSBvbiB0aGF0IG1ldGhvZCwgdGhlIE9TQ0Egc2VjdGlvbiBbQXNzaWduaW5nIGNlbGwgbGFiZWxzIGZyb20gZ2VuZSBzZXRzXShodHRwOi8vYmlvY29uZHVjdG9yLm9yZy9ib29rcy8zLjE2L09TQ0EuYmFzaWMvY2VsbC10eXBlLWFubm90YXRpb24uaHRtbCNhc3NpZ25pbmctY2VsbC1sYWJlbHMtZnJvbS1nZW5l
LXNldHMpIGlzIGEgdmVyeSBnb29kIHJlZmVyZW5jZS4KCgojIyBDZWxsIHR5cGUgYW5ub3RhdGlvbiB3aXRoIGBTaW5nbGVSYAoKQW4gYWx0ZXJuYXRpdmUgYXBwcm9hY2ggdG8gdXNpbmcga25vd24gbWFya2VyIGdlbmVzIGZvciBjbGFzc2lmaWNhdGlvbiBpcyB0byBpbnN0ZWFkIGNsYXNzaWZ5IGNlbGxzIGJ5IGNvbXBhcmluZyB0aGVtIHRvIGEgcmVmZXJlbmNlIGV4cHJlc3Npb24gZGF0YXNldC4KVG8gZG8gdGhpcywgd2Ugd2lsbCBmaW5kIGEgd2VsbC1jdXJhdGVkIGdlbmUgZXhwcmVzc2lvbiBkYXRhc2V0IHRoYXQgY29udGFpbnMgc2FtcGxlcyB3aXRoIGtub3duIGNlbGwgdHlwZXMuCldlIGNhbiB0aGVuIHRyYWluIGEgbW9kZWwgYmFzZWQgb24gdGhpcyBkYXRhc2V0IGFuZCBsb29rIGF0IGVhY2ggb2YgdGhlIGNlbGxzIGluIG91ciBuZXcgZGF0YXNldCB0byBkZXRlcm1pbmUgd2hpY2ggKGlmIGFueSkgb2YgdGhlIGtub3duIGNlbGwgdHlwZXMgaGFzIHRoZSBtb3N0IHNpbWlsYXIgZXhwcmVzc2lvbiBwYXR0ZXJuLgpUaGUgZGV0YWlscyBvZiBob3cgc3VjaCBhIG1vZGVsIG1heSBiZSBjb25zdHJ1Y3RlZCBhbmQgdHJhaW5lZCB3aWxsIHZhcnkgYnkgdGhlIHNwZWNpZmljIG1ldGhvZCwgYnV0IHRoaXMgb3ZlcmFsbCBhcHByb2FjaCBpcyB3aWRlbHkgYXBwbGllZC4gCgpGb3IgdGhpcyBzZWN0aW9uLCB3ZSB3aWxsIGZvY3VzIG9uIHRoZSBgU2luZ2xlUmAgcGFja2FnZSBhbmQgaXRzIG1ldGhvZHMsIHdoaWNoIGFyZSBkZXNjcmliZWQgaW4gZGV0YWlsIGluIFtfVGhlIFNpbmdsZVIgQm9va19dKGh0dHBzOi8vYmlvY29uZHVjdG9yLm9yZy9ib29rcy8zLjE2L1NpbmdsZVJCb29rLykuCgojIyMgUmVmZXJlbmNlIGRhdGFzZXRzCgpTZWxlY3RpbmcgYSByZWZlcmVuY2UgZGF0YXNldCBpcyBvbmUgb2YgdGhlIG1vcmUgY3JpdGljYWwgc3RlcHMgZm9yIHRoaXMgZW50ZXJwcmlzZS4KQXQgdGhlIG1vc3QgYmFzaWMgbGV2ZWwsIGlmIHRoZSByZWZlcmVuY2UgZGF0YXNldCBkb2VzIG5vdCBpbmNsdWRlIHRoZSB0eXBlcyBvZiBjZWxscyB0aGF0IHdlIGV4cGVjdCB0byBzZWUgaW4gb3VyIHNhbXBsZSwgaXQgd29uJ3QgYmUgdXNlZnVsLgpTbyB3ZSB3aWxsIHdhbnQgYSByZWZlcmVuY2UgZGF0YXNldCB0aGF0IGhhcyBhcyBtYW55IGFzIHBvc3NpYmxlIG9mIHRoZSBjZWxsIHR5cGVzIHRoYXQgd2UgZXhwZWN0IHRvIGZpbmQgaW4gb3VyIGRhdGFzZXQsIGF0IGEgbGV2ZWwgb2YgZ3JhbnVsYXJpdHkgdGhhdCBhbGlnbnMgd2l0aCBvdXIgZ29hbHMuCgpGb3IgYFNpbmdsZVJgIHRoYXQgcmVmZXJlbmNlIGRhdGEgY2FuIGJlIGZyb20gYnVsayBSTkEgc2VxdWVuY2luZyBvciBmcm9tIG90aGVyIHNpbmdsZS1jZWxsIGV4cGVyaW1lbnRzLgpgU2luZ2xlUmAgaXMgYWxzbyBmYWlybHkgcm9idXN0IHRvIHRoZSBtZXRob2QgdXNlZCBmb3IgZ2VuZSBleHByZXNzaW9uIHF1YW50aWZpY2F0aW9uLCB3aGljaCBtZWFucyB0aGF0IHdlIGNhbiB1c2Ug
ZWl0aGVyIFJOQS1zZXEgZGF0YXNldHMgb3IgbWljcm9hcnJheXMsIGlmIHRob3NlIGFyZSBtb3JlIHJlYWRpbHkgYXZhaWxhYmxlLiAKCk9uZSBjb252ZW5pZW50IHNvdXJjZSBvZiBjZWxsIHJlZmVyZW5jZSBkYXRhIGlzIHRoZSBgY2VsbGRleGAgcGFja2FnZSwgd2hpY2ggaXMgd2hhdCB3ZSB3aWxsIHVzZSBoZXJlLgpUaGlzIHBhY2thZ2UgaW5jbHVkZXMgZnVuY3Rpb25zIHRvIGRvd25sb2FkIGEgdmFyaWV0eSBvZiB3ZWxsLWFubm90YXRlZCByZWZlcmVuY2UgZGF0YXNldHMgaW4gYSBjb21tb24gZm9ybWF0LiAgCkZvciBtb3JlIGluZm9ybWF0aW9uIG9uIHRoZSBkYXRhc2V0cyBhdmFpbGFibGUsIHlvdSB3aWxsIHdhbnQgdG8gcmVmZXIgdG8gW3RoZSBgY2VsbGRleGAgc3VtbWFyeSB2aWduZXR0ZV0oaHR0cHM6Ly9iaW9jb25kdWN0b3Iub3JnL3BhY2thZ2VzLzMuMTYvZGF0YS9leHBlcmltZW50L3ZpZ25ldHRlcy9jZWxsZGV4L2luc3QvZG9jL3VzZXJndWlkZS5odG1sKS4KCldlIHdpbGwgc3RhcnQgYnkgdXNpbmcgYSByZWZlcmVuY2UgZGF0YXNldCBvZiBzb3J0ZWQgaW1tdW5lIGNlbGxzIGZyb20gW0dTRTEwNzAxMSAoTW9uYWNvIF9ldCBhbC5fIDIwMTkpXShodHRwczovL3d3dy5uY2JpLm5sbS5uaWguZ292L2dlby9xdWVyeS9hY2MuY2dpP2FjYz1HU0UxMDcwMTEpLgpUaGlzIHBhcnRpY3VsYXIgcmVmZXJlbmNlIHdhcyBjaG9zZW4gYmVjYXVzZSBpdCBpcyB3ZWxsLXN1aXRlZCB0byBQQk1DIGRhdGFzZXRzLCB3aXRoIGEgZ29vZCBsZXZlbCBvZiBncmFudWxhcml0eS4KClRoZSBgY2VsbGRleGAgZnVuY3Rpb25zIGFsc28gaGF2ZSBhIGNvbnZlbmllbnQgb3B0aW9uIHRvIGNvbnZlcnQgZ2VuZSBzeW1ib2xzIHRvIEVuc2VtYmwgaWRzLCB3aGljaCB3ZSB3aWxsIHVzZSBoZXJlIHNvIHRoYXQgb3VyIHJlZmVyZW5jZSBkYXRhIHVzZXMgdGhlIHNhbWUgZ2VuZSBpZGVudGlmaWVycyBhcyB0aGUgc2luZ2xlLWNlbGwgZGF0YS4KCmBgYHtyIGdldCBtb25hY299CiMgQmlvY29uZHVjdG9yICJIdWIiIHBhY2thZ2VzIHByb3ZpZGUgdGhlIG9wdGlvbiB0byBjYWNoZQojICAgZG93bmxvYWRzLCBidXQgdGhlIGludGVyYWN0aXZlIHByb21wdCBjYW4gYmUgYW5ub3lpbmcKIyAgIHdoZW4gd29ya2luZyB3aXRoIG5vdGVib29rcy4KIyBUaGVzZSBvcHRpb25zIGRpc2FibGUgdGhlIHByb21wdCBieSBnaXZpbmcgcGVybWlzc2lvbiAKIyAgIHRvIGNyZWF0ZSB0aGUgY2FjaGUgYXV0b21hdGljYWxseQpFeHBlcmltZW50SHViOjpzZXRFeHBlcmltZW50SHViT3B0aW9uKCJBU0siLCBGQUxTRSkKQW5ub3RhdGlvbkh1Yjo6c2V0QW5ub3RhdGlvbkh1Yk9wdGlvbigiQVNLIiwgRkFMU0UpCgojIEdldCBNb25hY28gMjAxOSBkYXRhIGZyb20gY2VsbGRleCB3aXRoIEVuc2VtYmwgaWRzLgptb25hY29fcmVmIDwtIGNlbGxkZXg6Ok1vbmFjb0ltbXVuZURhdGEoZW5zZW1ibCA9IFRSVUUpCmBgYAoKV2hhdCBpcyB0aGlzIGBtb25hY29fcmVmYCBvYmpl
Y3Q/CgpgYGB7ciBleHBsb3JlIHJlZiwgbGl2ZSA9IFRSVUV9Cm1vbmFjb19yZWYKYGBgCgpBIGBTdW1tYXJpemVkRXhwZXJpbWVudGAgaXMgdmVyeSBzaW1pbGFyIHRvIGEgYFNpbmdsZUNlbGxFeHBlcmltZW50YCwgZXhjZXB0IHJhdGhlciB0aGFuIGhhdmluZyBvbmUgY29sdW1uIHBlciBjZWxsLCBlYWNoIGNvbHVtbiBpcyBhICpzYW1wbGUqLgpPdGhlcndpc2UsIHRoZSBjb21wb25lbnRzIGFyZSB2ZXJ5IHNpbWlsYXI6IGVhY2ggcm93IGlzIHN0aWxsIGEgZ2VuZSwgZm9yIGV4YW1wbGUsIGFuZCBhZGRpdGlvbmFsIGRhdGEgYWJvdXQgdGhlIHNhbXBsZXMgYXJlIHN0b3JlZCBpbiB0aGUgYGNvbERhdGFgLgpJbiBmYWN0LCB0aGUgYFNpbmdsZUNlbGxFeHBlcmltZW50YCBvYmplY3QgaXMgZGVyaXZlZCBmcm9tIGEgYFN1bW1hcml6ZWRFeHBlcmltZW50YCwgd2l0aCBzb21lIGV4dHJhIHNsb3RzIHRoYXQgYXJlIG1vcmUgcmVsZXZhbnQgdG8gc2luZ2xlLWNlbGwgZGF0YS4KCldoYXQgaW5mb3JtYXRpb24gZG8gd2UgaGF2ZSBmb3IgdGhlIHNhbXBsZXM/CgpgYGB7ciBleHBsb3JlIHJlZmVyZW5jZSBzYW1wbGUgZGF0YSwgbGl2ZT1UUlVFfQpjb2xEYXRhKG1vbmFjb19yZWYpCmBgYAoKVGhlcmUgYXJlIHRocmVlIG1haW4gY29sdW1ucyBmb3IgdGhlIHNhbXBsZSBkYXRhOgoKLSBgbGFiZWwubWFpbmAgaXMgYSBtb3JlIGdlbmVyYWwgY2VsbCB0eXBlIGFzc2lnbm1lbnQuICAKCi0gYGxhYmVsLmZpbmVgIGlzIGEgZmluZS1sZXZlbCBjZWxsIHR5cGUgd2l0aCBtb3JlIHNwZWNpZmljIGxhYmVscy4KVGhlIGV4YWN0IGxldmVsIG9mIGdyYW51bGFyaXR5IG9mIHRoZXNlIGBtYWluYCBhbmQgYGZpbmVgIGRlc2lnbmF0aW9ucyAoYW5kIGluZGVlZCB0aGUgbGFiZWwgbmFtZXMgdGhlbXNlbHZlcykgd2lsbCB2YXJ5IGFtb25nIGRhdGFzZXRzLCBzbyBpdCBpcyBpbXBvcnRhbnQgdG8gbG9vayBhdCB0aGUgcmVmZXJlbmNlIHRvIHNlZSB3aGV0aGVyIGl0IGlzIHN1aXRhYmxlIGZvciB5b3VyIGFwcGxpY2F0aW9uLgoKLSBgbGFiZWwub250YCBpcyBhIHN0YW5kYXJkaXplZCBbQ2VsbCBPbnRvbG9neV0oaHR0cHM6Ly93d3cuZWJpLmFjLnVrL29scy9vbnRvbG9naWVzL2NsKSBpZGVudGlmaWVyLiAKVXNpbmcgdGhlIGNlbGwgb250b2xvZ3kgY2FuIGFsbG93IGZvciBtb3JlIGNvbXBsZXggcmVwcmVzZW50YXRpb25zIG9mIHRoZSByZWxhdGlvbnNoaXBzIGFtb25nIGRpZmZlcmVudCBjZWxsIHR5cGVzLCBidXQgaW52ZXN0aWdhdGluZyB0aGF0IGlzIGJleW9uZCB0aGUgc2NvcGUgb2YgdGhpcyB3b3Jrc2hvcC4gCgpBbm90aGVyIGNvbXBvbmVudCB3ZSB3b3VsZCBsaWtlIHRvIGV4cGxvcmUgaXMgaG93IG1hbnkgb2YgZWFjaCBvZiB0aGVzZSBjZWxsIHR5cGVzIHdlIGhhdmUgaW4gdGhlIHJlZmVyZW5jZSBkYXRhc2V0LiAKQSBiaXQgb2YgcXVpY2sgYGRwbHlyYCB3cmFuZ2xpbmcgY2FuIGdpdmUgdXMgdGhlIGFuc3dlci4KCmBgYHtyIGNvdW50IGNl
bGwgdHlwZXN9CmNvbERhdGEobW9uYWNvX3JlZikgfD4gCiAgYXMuZGF0YS5mcmFtZSgpIHw+CiAgZHBseXI6OmNvdW50KGxhYmVsLm1haW4sIGxhYmVsLmZpbmUpCmBgYAoKVGhpcyBpcyBwcmV0dHkgZ29vZCEgCk1vc3QgY2VsbCB0eXBlcyBoYXZlIDQgcmVwbGljYXRlcywgd2hpY2ggaXMgbW9yZSByZXBsaWNhdGVzIHRoYW4gd2Ugb2Z0ZW4gZmluZC4KCiMjIyBXaGF0IGRvZXMgYFNpbmdsZVJgIGRvPwoKQXMgbWVudGlvbmVkIGVhcmxpZXIsIGBTaW5nbGVSYCBidWlsZHMgYSBtb2RlbCBmcm9tIGEgc2V0IG9mIHRyYWluaW5nIGRhdGEsIGFuZCB0aGVuIHVzZXMgdGhhdCBtb2RlbCB0byBjbGFzc2lmeSBjZWxscyAob3IgZ3JvdXBzIG9mIGNlbGxzKSBpbiBuZXcgZGF0YXNldHMuCgpgU2luZ2xlUmAgd29ya3MgYnkgZmlyc3QgaWRlbnRpZnlpbmcgYSBzZXQgb2YgbWFya2VyIGdlbmVzIHRoYXQgY2FuIGJlIHVzZWQgdG8gZGlmZmVyZW50aWF0ZSBhbW9uZyB0aGUgY2VsbCB0eXBlcyBpbiB0aGUgcmVmZXJlbmNlIGRhdGFzZXQuIApJdCBkb2VzIHRoaXMgYnkgcGVyZm9ybWluZyBwYWlyd2lzZSBjb21wYXJpc29ucyBhbW9uZyBhbGwgb2YgdGhlIGNlbGwgdHlwZXMsIGFuZCByZXRhaW5pbmcgdGhlIHRvcCBzZXQgb2YgZ2VuZXMgZGlmZmVyZW50aWF0aW5nIGVhY2ggcGFpci4KVGhlIGlkZWEgaXMgdGhhdCB0aGlzIHNldCBvZiBnZW5lcyB3aWxsIGJlIHRoZSBtb3N0IGluZm9ybWF0aXZlIGZvciBkaWZmZXJlbnRpYXRpbmcgY2VsbCB0eXBlcy4KClRoZW4sIGZvciBlYWNoIGNlbGwsIGBTaW5nbGVSYCBjYWxjdWxhdGVzIHRoZSBTcGVhcm1hbiBjb3JyZWxhdGlvbiBiZXR3ZWVuIGV4cHJlc3Npb24gb2YgdGhhdCBjZWxsIGFuZCBlYWNoIGNlbGwgdHlwZSAodXNpbmcgdGhlIG9ubHkgdGhlIGdlbmVzIGNob3NlbiBlYXJsaWVyKS4KTm90YWJseSwgdGhpcyBpcyBhIG5vbi1wYXJhbWV0cmljIGNvcnJlbGF0aW9uLCBzbyB0aGUgc2NhbGluZyBhbmQgbm9ybWFsaXphdGlvbiB0aGF0IHdlIGFwcGx5IChvciBkb24ndCkgc2hvdWxkIG5vdCBtYXR0ZXIhCk5vdGUgdGhhdCBpZiB5b3UgdXNlZCBhIHNpbmdsZS1jZWxsIHRlY2hub2xvZ3kgdGhhdCBwcm9kdWNlcyBmdWxsLWxlbmd0aCB0cmFuc2NyaXB0cyAoaS5lLiwgU01BUlQtc2VxKSwgeW91IHdpbGwgcHJvYmFibHkgd2FudCB0byBjb252ZXJ0IHlvdXIgY291bnRzIHRvIFRyYW5zY3JpcHRzIHBlciBNaWxsaW9uIChUUE0pLCB0byBhbGxvdyBtb3JlIGNvbnNpc3RlbnQgcmFua2luZyBhbW9uZyB0cmFuc2NyaXB0cyBvZiBkaWZmZXJlbnQgbGVuZ3Rocy4gCgpUaGUgcmVmZXJlbmNlIGNlbGwgdHlwZSB3aXRoIHRoZSBoaWdoZXN0IGNvcnJlbGF0aW9uIGlzIHRoZW4gY2hvc2VuIGFzIHRoZSBjZWxsIHR5cGUgYXNzaWdubWVudCBmb3IgdGhhdCBjZWxsLgpJZiB0aGVyZSBhcmUgbXVsdGlwbGUgY2VsbCB0eXBlcyB3aXRoIGhpZ2ggc2NvcmVzLCBhbiBvcHRpb25hbCBmaW5lLXR1bmluZyBzdGVw
IHJlcGVhdHMgdGhlIHByb2Nlc3MgdXNpbmcgb25seSB0aGUgbW9zdCByZWxldmFudCBnZW5lcyBmb3IgdGhvc2UgY2VsbCB0eXBlcy4KCgojIyMgUnVubmluZyBgU2luZ2xlUmAKCkZvciBvdXIgZmlyc3QgcnVuLCB3ZSB3aWxsIGRvIHRoZSBtYXJrZXIgZ2VuZSBzZWxlY3Rpb24gKHRyYWluaW5nKSBhbmQgY2xhc3NpZmljYXRpb24gaW4gYSBzaW5nbGUgc3RlcCwgdXNpbmcgdGhlIGNvbnZlbmllbmNlIGZ1bmN0aW9uIGBTaW5nbGVSOjpTaW5nbGVSKClgLgpGb3IgdGhpcyB3ZSBuZWVkIG9ubHkgc3VwcGx5IHRocmVlIG1haW4gYXJndW1lbnRzOiBPdXIgU0NFIG9iamVjdCwgYSByZWZlcmVuY2UgbWF0cml4IChoZXJlIGluIGBTdW1tYXJpemVkRXhwZXJpbWVudGAgZm9ybWF0KSwgYW5kIHRoZSBsYWJlbHMgZm9yIGVhY2ggb2YgdGhlIHNhbXBsZXMgaW4gdGhlIHJlZmVyZW5jZSB0aGF0IHdlIHdhbnQgdG8gdXNlLgpXZSBhbHNvIG5lZWQgdG8gYmUgc3VyZSB0aGF0IG91ciBzYW1wbGUgYW5kIHRoZSByZWZlcmVuY2UgZGF0YSB1c2UgdGhlIHNhbWUgZ2VuZSBJRHMsIHdoaWNoIGlzIHdoeSB3ZSByZXF1ZXN0ZWQgdGhlIEVuc2VtYmwgSURzIHdoZW4gZ2V0dGluZyB0aGUgcmVmZXJlbmNlIGRhdGFzZXQuCgpCZWNhdXNlIHRoaXMgZnVuY3Rpb24gaXMgZG9pbmcgbWFueSByZXBldGl0aXZlIGNhbGN1bGF0aW9ucyAobG90cyBvZiBjb3JyZWxhdGlvbnMhKSwgd2UgY2FuIHNwZWVkIGl0IHVwIGJ5IGluY2x1ZGluZyB0aGUgYEJQUEFSQU1gIGFyZ3VtZW50LgpUaGlzIGlzIGEgY29tbW9uIGFyZ3VtZW50IGluIGBCaW9jb25kdWN0b3JgIHBhY2thZ2VzIHdoZXJlIGBCUGAgc3RhbmRzIGZvciB0aGUgYEJpb2NQYXJhbGxlbGAgcGFja2FnZSwgd2hpY2ggcHJvdmlkZXMgbXVsdGlwcm9jZXNzaW5nIGNhcGFiaWxpdGllcyB0byBtYW55IEJpb2NvbmR1Y3RvciBmdW5jdGlvbnMuIApJbiB0aGlzIGNhc2UsIHdlIHdpbGwgdXNlIHRoZSBhcmd1bWVudCBgQmlvY1BhcmFsbGVsOjpNdWx0aWNvcmVQYXJhbSg0KWAgdG8gc3BlY2lmeSB3ZSB3YW50IHRvIHVzZSBsb2NhbCBtdWx0aWNvcmUgcHJvY2Vzc2luZyB3aXRoIDQgIndvcmtlcnMiLgoKYGBge3Igc2ltcGxlIFNpbmdsZVIsIGxpdmU9VFJVRX0KIyBjYWxjdWxhdGUgU2luZ2xlUiByZXN1bHRzIGluIG9uZSBzdGVwCnNpbmdsZXJfcmVzdWx0IDwtIFNpbmdsZVI6OlNpbmdsZVIoCiAgc2NlLCAjIG91ciBxdWVyeSBTQ0UKICByZWYgPSBtb25hY29fcmVmLCAjIHJlZmVyZW5jZSBkYXRhc2V0CiAgbGFiZWxzID0gbW9uYWNvX3JlZiRsYWJlbC5tYWluLCAjIHJlZmVyZW5jZSBsYWJlbHMgdG8gdXNlCiAgQlBQQVJBTSA9IEJpb2NQYXJhbGxlbDo6TXVsdGljb3JlUGFyYW0oNCkgIyBtdWx0aXByb2Nlc3NpbmcKKQpgYGAKCmBTaW5nbGVSYCBwcm92aWRlcyBhIGZldyBuaWNlIHZpc3VhbGl6YXRpb25zIGZvciBldmFsdWF0aW5nIHRoZSBzdGF0aXN0aWNzIGl0IGNhbGN1bGF0ZWQgYW5kIHRoZSBhc3NpZ25t
ZW50cyBpdCBtYWtlcy4KT25lIGlzIGEgaGVhdG1hcCBvZiB0aGUgc2NvcmVzIGZvciBlYWNoIGNlbGwsIGFycmFuZ2VkIGJ5IHRoZSBjZWxsIHR5cGUgdGhhdCB3YXMgYXNzaWduZWQgdG8gZWFjaC4KVGhpcyBpcyBjcmVhdGVkIHdpdGggdGhlIGBTaW5nbGVSOjpwbG90U2NvcmVIZWF0bWFwKClgIGZ1bmN0aW9uLgoKYGBge3IgcGxvdCBTaW5nbGVSIGhlYXRtYXAsIGxpdmUgPSBUUlVFfQpTaW5nbGVSOjpwbG90U2NvcmVIZWF0bWFwKHNpbmdsZXJfcmVzdWx0KQpgYGAKV2UgY2FuIGFsc28gcHVsbCBvdXQgaW5kaXZpZHVhbCBjb21wb25lbnRzIG9mIHRoZSByZXN1bHRzIG9iamVjdCBmb3IgcGxvdHRpbmcgaW4gdGhlIGNvbnRleHQgb2Ygb3VyIGlucHV0IFNDRSBvYmplY3QuCkhlcmUgd2Ugd2lsbCBzYXZlIHRoZSBwcnVuZWQgbGFiZWxzICh3aGVyZSBsb3ctcXVhbGl0eSBhc3NpZ25tZW50cyBoYXZlIGJlZW4gZ2l2ZW4gYW4gYE5BYCBsYWJlbCksIHN0b3JpbmcgdGhlbSBiYWNrIGluIG91ciBTQ0Ugb2JqZWN0IChzcGVjaWZpY2FsbHkgdG8gYSBuZXcgY29sdW1uIG9mIHRoZSBgY29sRGF0YWAgdGFibGUpLgoKYGBge3Igc2F2ZSBjZWxsdHlwZXMsIGxpdmU9VFJVRX0Kc2NlJGNlbGx0eXBlX21haW4gPC0gc2luZ2xlcl9yZXN1bHQkcHJ1bmVkLmxhYmVscwpgYGAKCk5vdyB3ZSBjYW4gcGxvdCB0aGUgY2VsbCB0eXBlIGFzc2lnbm1lbnRzIG9udG8gb3VyIFVNQVAgdG8gc2VlIGhvdyB0aGV5IGNvbXBhcmUgdG8gdGhlIHBhdHRlcm5zIHdlIHNhdyB0aGVyZSBiZWZvcmUuCgpgYGB7ciBwbG90IGNlbGx0eXBlIHVtYXAsIGxpdmU9VFJVRX0Kc2NhdGVyOjpwbG90VU1BUChzY2UsIGNvbG9yX2J5ID0gImNlbGx0eXBlX21haW4iKSAKYGBgCkFubm95aW5nbHksIHRoZSBgTkFgIGFuZCBgVCBjZWxsc2AgbGFiZWxzIGFyZSBxdWl0ZSBjbG9zZSBpbiBjb2xvciwgYW5kIHRoZSBgc2NhdGVyYCBhbmQgYFNpbmdsZVJgIHBhY2thZ2VzIGRvbid0IGFncmVlIG9uIGNvbG9yIGNob2ljZXMuCkx1Y2tpbHksIHNpbmNlIGBwbG90VU1BUCgpYCByZXR1cm5zIGEgYGdncGxvdGAgb2JqZWN0LCB3ZSBjYW4gbW9kaWZ5IHRoZSBjb2xvciBwYWxldHRlIHVzaW5nIGBnZ3Bsb3QyYCBmdW5jdGlvbnMuClN0aWxsIGFubm95aW5nbHksIGhvd2V2ZXIsIHdoZW4gd2UgY2hhbmdlIHRoZSBwYWxldHRlLCB0aGUgbGVnZW5kIHRpdGxlIGRlZmF1bHRzIHRvIHRoZSB1bmluZm9ybWF0aXZlIG5hbWUgYCJjb2xvdXJfYnkiYCwgc28gd2UnbGwgYWxzbyBzcGVjaWZ5IGEgbWF0Y2hpbmcgbGVnZW5kIHRpdGxlIHdpdGggb3VyIG5ldyBjb2xvciBwYWxldHRlLgoKYGBge3IgcGxvdCBjZWxsdHlwZSB1bWFwIHBhbGV0dGV9CnNjYXRlcjo6cGxvdFVNQVAoc2NlLCBjb2xvcl9ieSA9ICJjZWxsdHlwZV9tYWluIikgKwogIHNjYWxlX2NvbG9yX2JyZXdlcihuYW1lID0gIkNlbGwgdHlwZSIsICMgbGVnZW5kIHRpdGxlCiAgICAgICAgICAgICAgICAgICAgIHBhbGV0dGUgPSAiRGFy
azIiLCAgICAgICMgY29sb3IgcGFsZXR0ZQogICAgICAgICAgICAgICAgICAgICBuYS52YWx1ZSA9ICJncmF5ODAiKSAgICAjIHVzZSBsaWdodCBncmF5IGZvciBOQSB2YWx1ZXMKYGBgCgpXZSBzZWVtIHRvIGhhdmUgYSBwcmV0dHkgZ29vZCBzZXQgb2YgY2VsbCB0eXBlIGFzc2lnbm1lbnRzLCB3aXRoIG1vc3QgZmFsbGluZyBpbnRvIGdyb3VwaW5ncyBjb25zaXN0ZW50IHdpdGggd2hhdCB3ZSBzZWUgaW4gdGhlIFVNQVAgcGxvdC4KCldlIGNhbiB0aGFuayB0aGUgZmFjdCB0aGF0IHRoaXMgaXMgYSBQQk1DIHNhbXBsZSBhbmQgdGhhdCB3ZSBoYXZlIGEgZ29vZCByZWZlcmVuY2UgZGF0YXNldCBmb3IgdGhlc2UgY2VsbCB0eXBlcyBmb3IgdGhlIGNsZWFubGluZXNzIG9mIHRoaXMgcGxvdC4KUXVpdGUgb2Z0ZW4gd2l0aCBvdGhlciBraW5kcyBvZiBzYW1wbGVzIChlc3BlY2lhbGx5IGNhbmNlciBjZWxscyEpIHRoaW5ncyB3aWxsIGJlIG11Y2ggbGVzcyBjbGVhbiEKCldlIGNhbiBhbHNvIGxvb2sgdG8gc2VlIGhvdyB0aGUgY2VsbCB0eXBlIGFzc2lnbm1lbnRzIGFyZSBkaXN0cmlidXRlZCB1c2luZyB0aGUgYmFzZSBSIGZ1bmN0aW9uIGB0YWJsZSgpYC4KU2luY2Ugd2UgbGlrZSB0byBrZWVwIHRyYWNrIG9mIHRoZSBjZWxscyB0aGF0IGVuZGVkIHVwIGFzIGBOQWAgaW4gdGhlIHBydW5lZCBsYWJlbHMsIHdlIHdpbGwgaW5jbHVkZSB0aGUgYHVzZU5BID0gImlmYW55ImAgYXJndW1lbnQuCgpgYGB7ciBjZWxsIHR5cGUgdGFibGV9CnRhYmxlKHNpbmdsZXJfcmVzdWx0JHBydW5lZC5sYWJlbHMsIHVzZU5BID0gImlmYW55IikKYGBgCgojIyMgRXhwbG9yaW5nIGZpbmVyIGxhYmVscwoKSW4gdGhlIHByZXZpb3VzIGNlbGwgdHlwaW5nLCB3ZSB1c2VkIHRoZSBgbGFiZWwubWFpbmAgY29sdW1uLCBidXQgd2UgYWxzbyBoYWQgYGxhYmVsLmZpbmVgLCBzbyBsZXQncyB1c2UgdGhhdCB0byBleHBsb3JlIHRoZSBkYXRhc2V0IGluIGEgYml0IG1vcmUgZGV0YWlsLiAKCldlIHdpbGwgYWxzbyB0YWtlIHRoaXMgdGltZSB0byBkaXZlIGEgYml0IGRlZXBlciBpbnRvIHRoZSBzdGVwcyB0aGF0IGBTaW5nbGVSYCBwZXJmb3JtZWQuIApBcyBtZW50aW9uZWQsIHRoZSBmaXJzdCBzdGVwIGlzIHRyYWluaW5nIHRoZSBtb2RlbCwgZHVyaW5nIHdoaWNoIHdlIGlkZW50aWZ5IHRoZSBnZW5lcyB0aGF0IHdpbGwgYmUgdXNlZCBmb3IgdGhlIGNvcnJlbGF0aW9uIGFuYWx5c2lzIGxhdGVyLgpXaGlsZSB0aGlzIHN0ZXAgaXMgbm90IHBhcnRpY3VsYXJseSBzbG93LCBpZiB3ZSB3ZXJlIGNsYXNzaWZ5aW5nIG11bHRpcGxlIHNhbXBsZXMsIHdlIHdvdWxkIG5vdCB3YW50IHRvIGhhdmUgdG8gcmVwZWF0IGl0IGZvciBldmVyeSBzYW1wbGUuCgpUbyBkbyB0aGUgdHJhaW5pbmcsIHdlIHdpbGwgdXNlIHRoZSBgdHJhaW5TaW5nbGVSKClgIGZ1bmN0aW9uLiAKRm9yIHRoaXMgd2Ugd2lsbCBzdGFydCB3aXRoIG91ciByZWZlcmVuY2UgYW5kIHRoZSBsYWJlbHMgd2Ugd2Fu
dCB0byB0cmFpbiB0aGUgbW9kZWwgd2l0aC4KCldlIGNhbiB0aGVuIHNwZWNpZnkgdGhlIG1ldGhvZCB1c2VkIHRvIHNlbGVjdCB0aGUgZ2VuZXMgdGhhdCB3aWxsIGJlIHVzZWQgZm9yIGNsYXNzaWZpY2F0aW9uLgpUaGUgZGVmYXVsdCBtZXRob2QgaXMgYCJkZSJgLCB3aGljaCBwZXJmb3JtcyBhIGRpZmZlcmVudGlhbCBleHByZXNzaW9uIGFuYWx5c2lzIGZvciBlYWNoIHBhaXIgb2YgbGFiZWxzLCBidXQgd2UgY291bGQgYWxzbyB1c2UgYCJzZCJgIHRvIHNlbGVjdCB0aGUgZ2VuZXMgd2hpY2ggYXJlIG1vc3QgdmFyaWFibGUgYWNyb3NzIGxhYmVscywgb3IgYCJhbGwiYCB0byB1c2UgYWxsIGdlbmVzLgpJZiB3ZSB3YW50IHRvIGdldCByZWFsbHkgZmFuY3ksIHdlIGNvdWxkIGV2ZW4gcHJvdmlkZSBhIHNwZWNpZmljIGxpc3Qgb2YgZ2VuZXMgdG8gdXNlLgoKV2Ugc2hvdWxkIG5vdGUgaGVyZSB0aGF0IHRoZSByZWZlcmVuY2UgZGF0YXNldCBmb3IgYFNpbmdsZVJgIGRvZXMgbm90IG5lZWQgdG8gYmUgZnJvbSBhIGNvbXBlbmRpdW0gbGlrZSBgY2VsbGRleGAhCklmIHlvdSBoYXZlIGFueSB3ZWxsLWNsYXNzaWZpZWQgZGF0YXNldCB0aGF0IHlvdSB3YW50IHRvIHVzZSBhcyBhIHJlZmVyZW5jZSwgeW91IGNhbiwgYXMgbG9uZyBhcyB5b3UgY2FuIGNyZWF0ZSBhIGdlbmUgYnkgc2FtcGxlIGV4cHJlc3Npb24gbWF0cml4IGFuZCBhIHZlY3RvciBvZiBjZWxsIHR5cGVzIGZvciBlYWNoIHNhbXBsZS4gCllvdSB3aWxsIHdhbnQgdG8gZW5zdXJlIHRoYXQgdGhlIGNlbGwgdHlwZXMgeW91IGV4cGVjdCB0byBzZWUgaW4geW91ciBzYW1wbGUgYXJlIHByZXNlbnQgaW4gdGhlIHJlZmVyZW5jZSBkYXRhc2V0LCBhbmQgZGF0YSBzaG91bGQgYmUgbm9ybWFsaXplZCwgYnV0IG90aGVyd2lzZSB0aGUgbWV0aG9kIGNhbiBiZSBxdWl0ZSBmbGV4aWJsZS4gCllvdSBjYW4gZXZlbiB1c2UgYSBwcmV2aW91c2x5LWFubm90YXRlZCBgU2luZ2xlQ2VsbEV4cGVyaW1lbnRgIGFzIGEgcmVmZXJlbmNlIGZvciBhIG5ldyBkYXRhc2V0LgpGb3IgbW9yZSBkZXRhaWxzIGFib3V0IGN1c3RvbSByZWZlcmVuY2VzLCBzZWUgdGhlIFtPU0NBIGNoYXB0ZXIgb24gY2VsbCB0eXBlIGFubm90YXRpb25dKGh0dHA6Ly9iaW9jb25kdWN0b3Iub3JnL2Jvb2tzLzMuMTYvT1NDQS5iYXNpYy9jZWxsLXR5cGUtYW5ub3RhdGlvbi5odG1sI3VzaW5nLWN1c3RvbS1yZWZlcmVuY2VzKQoKV2UgZG8gd2FudCB0byBiZSBzdXJlIHRoYXQgdGhlIGdlbmVzIHNlbGVjdGVkIGZvciB0aGUgbW9kZWwgd2lsbCBiZSBhbW9uZyB0aG9zZSBwcmVzZW50IGluIG91ciBTQ0Ugb2JqZWN0LCBzbyB3ZSB3aWxsIHVzZSB0aGUgYHJlc3RyaWN0YCBhcmd1bWVudCB3aXRoIGEgdmVjdG9yIG9mIHRoZSBnZW5lcyBpbiBvdXIgU0NFLgpUaGlzIHN0ZXAgd291bGQgaGFwcGVuIGF1dG9tYXRpY2FsbHkgd2l0aCB0aGUgYFNpbmdsZVI6OlNpbmdsZVIoKWAgZnVuY3Rpb24sIGJ1dCB3ZSBuZWVkIHRvIGFkZCBp
dCBtYW51YWxseSBmb3IgdGhpcyB1c2UgY2FzZS4KCgpgYGB7ciB0cmFpbiBmaW5lbW9kZWwsIGxpdmU9VFJVRX0KIyBidWlsZCBmaW5lIG1vZGVsCnNpbmdsZXJfZmluZW1vZGVsIDwtIFNpbmdsZVI6OnRyYWluU2luZ2xlUigKICBtb25hY29fcmVmLCAjIHJlZmVyZW5jZSBkYXRhc2V0CiAgbGFiZWxzID0gbW9uYWNvX3JlZiRsYWJlbC5maW5lLCAjIGxhYmVscyBmb3IgdHJhaW5pbmcgZGF0YXNldAogICMgdXNlIERFIHRvIHNlbGVjdCBnZW5lcyAoZGVmYXVsdCkKICBnZW5lcyA9ICJkZSIsIAogICMgb25seSB1c2UgZ2VuZXMgaW4gdGhlIHNjZSBvYmplY3QKICByZXN0cmljdCA9IHJvd25hbWVzKHNjZSksCiAgIyBwYXJhbGxlbCBwcm9jZXNzaW5nCiAgQlBQQVJBTSA9IEJpb2NQYXJhbGxlbDo6TXVsdGljb3JlUGFyYW0oNCkKKQpgYGAKCk5vdyB3ZSBjYW4gcGVyZm9ybSB0aGUgY2xhc3NpZmljYXRpb24gc3RlcCwgdXNpbmcgb3VyIFNDRSBvYmplY3QgYW5kIHRoZSBgU2luZ2xlUmAgbW9kZWwgdGhhdCB3ZSBqdXN0IGNyZWF0ZWQuCgpgYGB7ciBjbGFzc2lmeSBmaW5lLCBsaXZlPVRSVUV9CiMgY2xhc3NpZnkgd2l0aCBmaW5lIG1vZGVsCnNpbmdsZXJfcmVzdWx0X2ZpbmUgPC0gU2luZ2xlUjo6Y2xhc3NpZnlTaW5nbGVSKAogIHNjZSwgIyBvdXIgU0NFIG9iamVjdAogIHNpbmdsZXJfZmluZW1vZGVsLCAjIHRoZSB0cmFpbmVkIG1vZGVsIG9iamVjdAogICMgcGVyZm9ybSBmaW5lIHR1bmluZyAoZGVmYXVsdCkKICBmaW5lLnR1bmUgPSBUUlVFLAogICMgcGFyYWxsZWwgcHJvY2Vzc2luZwogIEJQUEFSQU0gPSBCaW9jUGFyYWxsZWw6Ok11bHRpY29yZVBhcmFtKDQpCikKYGBgCgoKV2hhdCBsYWJlbHMgd2VyZSBhc3NpZ25lZCwgYW5kIGhvdyBtYW55IG9mIGVhY2g/CgpgYGB7ciB0YWJsZSBmaW5lIGxhYmVsc30KdGFibGUoc2luZ2xlcl9yZXN1bHRfZmluZSRwcnVuZWQubGFiZWxzLCB1c2VOQSA9ICJpZmFueSIpCmBgYAoKYGBge3IgcGxvdCB1bWFwIGZpbmUsIGxpdmU9VFJVRX0KIyBhZGQgZmluZSBsYWJlbHMgdG8gU0NFCnNjZSRjZWxsdHlwZV9maW5lIDwtIHNpbmdsZXJfcmVzdWx0X2ZpbmUkcHJ1bmVkLmxhYmVscwojIHBsb3QgVU1BUCB3aXRoIGZpbmUgbGFiZWxzCnNjYXRlcjo6cGxvdFVNQVAoc2NlLCBjb2xvcl9ieSA9ICJjZWxsdHlwZV9maW5lIikKYGBgCgpUaGF0J3MgYSBwcmV0dHkgbWVzc3kgcGxvdC4KTW9zdGx5IHRoYXQgaXMgYmVjYXVzZSB0aGVyZSBhcmUgX2xvdHNfIG9mIGNlbGwgdHlwZXMgaGVyZSwgYW5kIG5vdCBlbm91Z2ggY29sb3JzIHRvIHJlcHJlc2VudCB0aGVtIGFsbC4KVGhlIGBOQWAgY2VsbHMgYWxzbyBnb3QgdGFrZW4gb2ZmIGNvbXBsZXRlbHksIHdoaWNoIGlzIG5vdCBpZGVhbC4KCk9uZSB0aGluZyB3ZSBjYW4gZG8gaXMgdG8gdXNlIHNvbWUgZnVuY3Rpb25zIGZyb20gdGhlIGB0aWR5dmVyc2VgIHBhY2thZ2UgW2Bmb3JjYXRzYF0oaHR0cHM6Ly9mb3JjYXRzLnRpZHl2ZXJzZS5vcmcpLCB3aGlj
aCBjYW4gYmUgdmVyeSBoYW5keSBmb3IgZGVhbGluZyB3aXRoIGNhdGVnb3JpY2FsIHZhcmlhYmxlcyBsaWtlIHRoZXNlIGNlbGwgdHlwZXMuCgpXZSB3aWxsIHVzZSB0d28gb2YgdGhlc2UgZnVuY3Rpb25zIGluIHRoZSBjaHVuayBiZWxvdzogCkZpcnN0IHdlIHdpbGwgdXNlIGBmY3RfY29sbGFwc2VgIHRvIHRha2Ugc29tZSBvZiB0aGUgZmluZXIgbGFiZWxzIHRoYXQgd2UgbWlnaHQgbm90IGJlIGFzIGludGVyZXN0ZWQgaW4gYW5kIGNvbGxhcHNlIHRoZW0gaW50byBsb2dpY2FsIGdyb3VwaW5ncyAoaW4gdGhpcyBjYXNlLCB0aGUgYG1haW5gIGxhYmVsIHRoYXQgdGhleSB3ZXJlIHBhcnQgb2YpLgpBZnRlciB0aGF0LCB3ZSB3aWxsIHVzZSBgZmN0X3JlbGV2ZWxgIHRvIHB1dCB0aGUgcmVtYWluaW5nIGZhY3RvciBsZXZlbHMgaW4gdGhlIG9yZGVyIHdlIHdvdWxkIGxpa2UgdGhlbSB0byBhcHBlYXIgZm9yIHBsb3R0aW5nLgoKYGBge3IgY29sbGFwc2UgbGFiZWxzfQpjb2xsYXBzZWRfbGFiZWxzIDwtIHNpbmdsZXJfcmVzdWx0X2ZpbmUkcHJ1bmVkLmxhYmVscyB8PgogIGZvcmNhdHM6OmZjdF9jb2xsYXBzZSgKICAgICJNb25vY3l0ZXMiID0gYygKICAgICAgICAiQ2xhc3NpY2FsIG1vbm9jeXRlcyIsIAogICAgICAgICJJbnRlcm1lZGlhdGUgbW9ub2N5dGVzIiwgCQogICAgICAgICJOb24gY2xhc3NpY2FsIG1vbm9jeXRlcyIpLAogICAgIkRlbmRyaXRpYyBjZWxscyIgPSBjKAogICAgICAgICJNeWVsb2lkIGRlbmRyaXRpYyBjZWxscyIsCiAgICAgICAgIlBsYXNtYWN5dG9pZCBkZW5kcml0aWMgY2VsbHMiKSwKICAgICJUIGNlbGxzIiA9IGMoCiAgICAgICAgIk1BSVQgY2VsbHMiLAogICAgICAgICJOb24tVmQyIGdkIFQgY2VsbHMiLAogICAgICAgICJWZDIgZ2QgVCBjZWxscyIpLAogICAgIkhlbHBlciBUIGNlbGxzIiA9IGMoCiAgICAgICAgIlRoMSBjZWxscyIsCiAgICAgICAgIlRoMS9UaDE3IGNlbGxzIiwgCiAgICAgICAgIlRoMTcgY2VsbHMiLCAKICAgICAgICAiVGgyIGNlbGxzIiwKICAgICAgICAiRm9sbGljdWxhciBoZWxwZXIgVCBjZWxscyIpLAogICAgIkIgY2VsbHMiID0gYygKICAgICAgICAiTmFpdmUgQiBjZWxscyIsCiAgICAgICAgIlN3aXRjaGVkIG1lbW9yeSBCIGNlbGxzIiwKICAgICAgICAiTm9uLXN3aXRjaGVkIG1lbW9yeSBCIGNlbGxzIiwKICAgICAgICAiRXhoYXVzdGVkIEIgY2VsbHMiLAogICAgICAgICJQbGFzbWFibGFzdHMiCQkKICAgICkKICApIHw+CiAgIyBvcmRlciBmb3IgcGxvdHRpbmcKICBmb3JjYXRzOjpmY3RfcmVsZXZlbCgKICAgICJIZWxwZXIgVCBjZWxscyIsCiAgICAiVCByZWd1bGF0b3J5IGNlbGxzIiwKICAgICJOYWl2ZSBDRDQgVCBjZWxscyIsCiAgICAiVGVybWluYWwgZWZmZWN0b3IgQ0Q0IFQgY2VsbHMiLAogICAgIk5haXZlIENEOCBUIGNlbGxzIiwKICAgICJDZW50cmFsIG1lbW9yeSBDRDggVCBjZWxscyIsCiAgICAiRWZmZWN0b3IgbWVtb3J5IENEOCBUIGNlbGxzIiwKICAgICJU
ZXJtaW5hbCBlZmZlY3RvciBDRDggVCBjZWxscyIsCiAgICAiVCBjZWxscyIsCiAgICAiTmF0dXJhbCBraWxsZXIgY2VsbHMiLAogICAgIkIgY2VsbHMiLAogICAgIk1vbm9jeXRlcyIsCiAgICAiRGVuZHJpdGljIGNlbGxzIiwKICAgICJQcm9nZW5pdG9yIGNlbGxzIiwKICAgICJMb3ctZGVuc2l0eSBiYXNvcGhpbHMiCiAgKQpgYGAKCk5vdyB0aGF0IHdlIGhhdmUgdGhhdCBzZXQgdXAsIHdlIGNhbiBwbG90IHVzaW5nIG91ciBjb2xsYXBzZWQgYW5kIG9yZGVyZWQgY2VsbCB0eXBlIGxhYmVscy4KCmBgYHtyIHBsb3QgY29sbGFwc2VkLCBsaXZlPVRSVUV9CnNjZSRjZWxsdHlwZV9jb2xsYXBzZWQgPC0gY29sbGFwc2VkX2xhYmVscwpzY2F0ZXI6OnBsb3RVTUFQKHNjZSwgCiAgICAgICAgICAgICAgICAgY29sb3JfYnkgPSAiY2VsbHR5cGVfY29sbGFwc2VkIikKYGBgCgoKIyMjIEhlYXRtYXAgb2YgY2VsbCB0eXBlcyAmIGNsdXN0ZXJzCgpMZXQncyBsb29rIGF0IGhvdyB0aGUgY2VsbCB0eXBlIGFzc2lnbm1lbnRzIHdlIG9idGFpbmVkIHVzaW5nIGBTaW5nbGVSYCBjb21wYXJlIHRvIHRoZSBjbHVzdGVycyB0aGF0IHdlIGZvdW5kIHVzaW5nIHRoZSB1bnN1cGVydmlzZWQgY2x1c3RlcmluZyBhdCB0aGUgc3RhcnQgb2YgdGhpcyBub3RlYm9vay4KClRvIGRvIHRoaXMsIHdlIHdpbGwgYWdhaW4gdXNlIHRoZSBgdGFibGUoKWAgZnVuY3Rpb24sIGJ1dCBub3cgd2l0aCB0d28gdmVjdG9ycyBhcyBpbnB1dCwgdG8gYnVpbGQgYSBjb250aW5nZW5jeSB0YWJsZSBvZiB0aGUgY2VsbCB0eXBlcyBhbmQgY2x1c3RlcnMgdGhhdCBlYWNoIGNlbGwgd2FzIGNsYXNzaWZpZWQgd2l0aC4gCgpgYGB7ciB0eXBlIGNsdXN0ZXIgdGFibGV9CiMgY3JlYXRlIGEgdGFibGUgb2YgY2x1c3RlcnMgJiBjZWxsIHR5cGUgY291bnRzCnR5cGVfY2x1c3Rlcl90YWIgPC0gdGFibGUoc2NlJGNlbGx0eXBlX2ZpbmUsIHNjZSRubl9jbHVzdGVyLCB1c2VOQSA9ICJpZmFueSIpCgojIGxvb2sgYXQgdGhlIHRvcCBjb3JuZXIgb2YgdGhlIHJlc3VsdHMKdHlwZV9jbHVzdGVyX3RhYlsxOjUsIDE6NV0KYGBgCgpBcyB5b3UgY2FuIHNlZSwgdGhpcyBwcm9kdWNlZCBhIHRhYmxlIHdpdGggcm93cyBmb3IgZWFjaCBjZWxsIHR5cGUgYW5kIGNvbHVtbnMgZm9yIGVhY2ggY2x1c3RlciBudW1iZXIuClRoZSB2YWx1ZXMgYXJlIHRoZSBjb3VudCBvZiBjZWxscyBmb3IgZWFjaCBjbHVzdGVyL2NlbGwgdHlwZSBjb21iaW5hdGlvbi4KSG93ZXZlciwgdGhlc2UgcmF3IGNvdW50cyBhcmUgbm90IHF1aXRlIHdoYXQgd2UnbGwgd2FudCBmb3IgdmlzdWFsaXphdGlvbi4gClNpbmNlIHRoZSB0b3RhbCBudW1iZXIgb2YgY2VsbHMgZGlmZmVycyBhY3Jvc3MgY2x1c3RlcnMsIHdlJ2QgbGlrZSB0byBjb252ZXJ0IHRoZXNlIGNvdW50cyBpbnRvIHRoZSBfcHJvcG9ydGlvbnNfIG9mIGVhY2ggY2VsbCB0eXBlIGluIGVhY2ggY2x1c3Rlci4KCldlJ2xsIGRvIHRoaXMgYnkgZ29pbmcgdGhyb3VnaCB0
aGUgdGFibGUgY29sdW1uIGJ5IGNvbHVtbiBhbmQgZGl2aWRpbmcgZWFjaCB2YWx1ZSBieSB0aGUgc3VtIGZvciB0aGF0IGNsdXN0ZXIuClRoaXMgd2lsbCBnaXZlIHVzIG5vcm1hbGl6ZWQgdmFsdWVzIHdoZXJlIHRoZSB2YWx1ZXMgaW4gZWFjaCBjb2x1bW4gbm93IHN1bSB0byAxLgpUbyBkbyB0aGF0LCB3ZSB3aWxsIHVzZSB0aGUgYGFwcGx5YCBmdW5jdGlvbiwgd2hpY2ggYWxsb3dzIHVzIHRvIG9wZXJhdGUgb24gYSBtYXRyaXggcm93IGJ5IHJvdyBvciBjb2x1bW4gYnkgY29sdW1uLCBhcHBseWluZyBhIGZ1bmN0aW9uIHRvIGVhY2ggInNsaWNlIi4KU2luY2UgdGhlIGZ1bmN0aW9uIHdlIHdhbnQgdG8gYXBwbHkgaXMgdmVyeSBzaG9ydCwgd2Ugd2lsbCB1c2UgUidzIG5ldyAoYXMgb2YgIHZlcnNpb24gNC4xKSBhbm9ueW1vdXMgZnVuY3Rpb24gc2hvcnRoYW5kOiAKYFwoeCkgLi4uYCBjYW4gYmUgdXNlZCB0byBkZWZpbmUgYSBmdW5jdGlvbiB0aGF0IHRoYXQgdGFrZXMgYXMgaW5wdXQgdmFsdWVzIGB4YCAod2hlcmUgdGhlIGAuLi5gIGlzIHdoZXJlIHlvdSB3b3VsZCBwdXQgdGhlIGV4cHJlc3Npb24gdG8gY2FsY3VsYXRlKS4KSGVyZSB3ZSB3aWxsIGFwcGx5IHRoZSBleHByZXNzaW9uIGB4L3N1bSh4KWAsIHdoaWNoIHdpbGwgZGl2aWRlIGVhY2ggZWxlbWVudCBvZiBhIHZlY3RvciBgeGAgYnkgdGhlIHN1bSBvZiBpdHMgdmFsdWVzLgoKYGBge3Igbm9ybWFsaXplIGJ5IGNvbHVtbn0KIyBub3JtYWxpemUgYnkgdGhlIG51bWJlciBvZiBjZWxscyBpbiBlYWNoIGNsdXN0ZXIgKGNvbHVtbnMpCnR5cGVfY2x1c3Rlcl90YWIgPC0gYXBwbHkoCiAgdHlwZV9jbHVzdGVyX3RhYiwgCiAgMiwgIyBhcHBseSBmdW5jdGlvbiB0byBjb2x1bW5zCiAgXCh4KSB4L3N1bSh4KSAjIGZ1bmN0aW9uIHRvIGFwcGx5CikKIyBwcmludCB0aGUgbm9ybWFsaXplZCB2YWx1ZXMKdHlwZV9jbHVzdGVyX3RhYlsxOjUsIDE6NV0KYGBgCgpOb3cgd2UgY2FuIHBsb3QgdGhlc2UgcmVzdWx0cyBhcyBhIGhlYXRtYXAsIHVzaW5nIHRoZSBgcGhlYXRtYXBgIHBhY2thZ2UuIApUaGVyZSBpcyBhIGxvdCBvZiBjdXN0b21pemF0aW9uIHdlIGNvdWxkIGRvIGhlcmUsIGJ1dCBgcGhlYXRtYXBgIChwcmV0dHkgaGVhdG1hcCkgaGFzIGdvb2QgZGVmYXVsdHMsIHNvIHdlIHdvbid0IHNwZW5kIHRvbyBtdWNoIHRpbWUgb24gaXQgZm9yIG5vdy4KCmBgYHtyIGNsdXN0ZXIgaGVhdG1hcCwgbGl2ZT1UUlVFfQojIHBsb3Qgd2l0aCBwaGVhdG1hcApwaGVhdG1hcDo6cGhlYXRtYXAodHlwZV9jbHVzdGVyX3RhYikKYGBgCgpXZSBjYW4gc2VlIHRoYXQgbW9zdCBvZiBvdXIgY2x1c3RlcnMgYXJlIGluZGVlZCBkZWZpbmVkIGJ5IGEgc2luZ2xlIGNlbGwgdHlwZSwgdGhvdWdoIHRoZXJlIGFyZSBzb21lIGNsdXN0ZXJzIChlLmcuLCAxICYgOSkgdGhhdCBoYXZlIGEgbnVtYmVyIG9mIChyZWxhdGVkKSBjZWxsIHR5cGVzIHdpdGhpbiB0aGVtLgpUaGVyZSBhcmUgYWxzbyBzb21lIHBs
YWNlcyB3aGVyZSBzaW5nbGUgY2VsbCB0eXBlcyBhcmUgc3ByZWFkIGFjcm9zcyBhIGZldyBkaWZmZXJlbnQgY2x1c3RlcnMgKENsYXNzaWNhbCBtb25vY3l0ZXMsIGZvciBleGFtcGxlKS4KCiMjIyBDbGFzc2lmeWluZyBieSBjbHVzdGVycwoKV2hpbGUgbW9zdCBvZiB0aGUgdGltZSB3ZSB3aWxsIHdhbnQgdG8gY2xhc3NpZnkgc2luZ2xlIGNlbGxzLCBzb21ldGltZXMgdGhlIHNwYXJzZW5lc3Mgb2YgdGhlIGRhdGEgbWF5IG1lYW4gdGhhdCBpbmRpdmlkdWFsIGNlbGxzIGRvIG5vdCBwcm92aWRlIHJlbGlhYmxlIGVzdGltYXRlcyBvZiBjZWxsIHR5cGVzLiAKCkFuIGFsdGVybmF0aXZlIGFwcHJvYWNoIGlzIHRvIGNsYXNzaWZ5IHRoZSBjbHVzdGVycyBhcyBhIHdob2xlLCBhc3N1bWluZyB0aGF0IHRoZSBjbHVzdGVycyB3ZSBoYXZlIGlkZW50aWZpZWQgcmVwcmVzZW50IGEgc2luZ2xlIGNlbGwgc3RhdGUuCklmIHRoYXQgaXMgdGhlIGNhc2UsIHRoZW4gd2Ugc2hvdWxkIGJlIGFibGUgdG8gY29tYmluZSB0aGUgZGF0YSBmb3IgYWxsIGNlbGxzIGFjcm9zcyBlYWNoIGNsdXN0ZXIsIHRoZW4gYXBwbHkgb3VyIGNlbGwgdHlwaW5nIG1ldGhvZCB0byB0aGlzIGdyb3VwIG9mIGNlbGxzLiAKVGhpcyBpcyBzaW1pbGFyIHRvIGFuIGFwcHJvYWNoIHdlIHdpbGwgcmV0dXJuIHRvIGxhdGVyIGluIHRoZSBjb250ZXh0IG9mIGRpZmZlcmVudGlhbCBleHByZXNzaW9uLiAKClRoZSBmaXJzdCBzdGVwIGhlcmUgaXMgdG8gY3JlYXRlIGEgbmV3IG1hdHJpeCB3aGVyZSB3ZSBzdW0gdGhlIGNvdW50cyBhY3Jvc3MgY2VsbHMgdGhhdCBhcmUgZnJvbSB0aGUgc2FtZSB0eXBlIGFjY29yZGluZyB0byBvdXIgY2x1c3RlcmluZy4KQmVjYXVzZSBgU2luZ2xlUmAgaXMgYSBub24tcGFyYW1ldHJpYyBhcHByb2FjaCwgd2UgY2FuIHBlcmZvcm0gdGhpcyBzdGVwIHdpdGggdGhlIHJhdyBjb3VudHMgbWF0cml4LgpUaGVyZSBhcmUgYSBmZXcgZGlmZmVyZW50IHdheXMgdG8gZG8gdGhpcywgYnV0IHdlIHdpbGwgdXNlIHRoZSBmdW5jdGlvbiBgRGVsYXllZEFycmF5Ojpjb2xzdW0oKWAsIHdoaWNoIGNhbiB3b3JrIGRpcmVjdGx5IG9uIHRoZSBzcGFyc2UgbWF0cmljZXMgdGhhdCBhcmUgb2Z0ZW4gZm91bmQgaW4gU0NFIG9iamVjdHMuCldlIHdpbGwgcHJvdmlkZSBpdCB3aXRoIHRoZSBtYXRyaXggd2UgbmVlZCwgYW5kIHRoZW4gYSB2ZWN0b3Igb2YgdGhlIGNsdXN0ZXIgYXNzaWdubWVudHMgZm9yIGVhY2ggY29sdW1uIG9mIHRoZSBtYXRyaXguClRoZSBmdW5jdGlvbiB3aWxsIHRoZW4gc3VtIGV4cHJlc3Npb24gdmFsdWVzIGZvciBlYWNoIGdlbmUgYWNyb3NzIGFsbCBvZiB0aGUgY29sdW1ucyB0aGF0IGhhdmUgdGhhdCB2YWx1ZS4KCmBgYHtyIHN1bSBjbHVzdGVyc30KIyBzdW0gY291bnQgbWF0cml4IGJ5IGNsdXN0ZXIKY2x1c3Rlcl9tYXQgPC0gRGVsYXllZEFycmF5Ojpjb2xzdW0oY291bnRzKHNjZSksIHNjZSRubl9jbHVzdGVyKQojIHByaW50IG5ldyBkaW1l
bnNpb25zCmRpbShjbHVzdGVyX21hdCkKYGBgCgpZb3UgY2FuIHNlZSB0aGF0IHRoZSByZXN1bHRpbmcgbWF0cml4IHN0aWxsIGhhcyB0aGUgc2FtZSBudW1iZXIgb2Ygcm93cyB3ZSBoYXZlIHNlZW4gYmVmb3JlLCBidXQgbm93IG9ubHkgaGFzIGFzIG1hbnkgY29sdW1ucyBhcyB0aGUgbnVtYmVyIG9mIGNsdXN0ZXJzIHRoYXQgdGhlIGNlbGxzIHdlcmUgYXNzaWduZWQgdG8uCgpOb3cgd2UgY2FuIGFwcGx5IHRoZSBzYW1lIGBTaW5nbGVSYCBtb2RlbCB0byB0aGVzZSByZXN1bHRzLCB1c2luZyB0aGUgbmV3IG1hdHJpeCBhcyBpbnB1dCBhbG9uZyB3aXRoIHRoZSBwcmV2aW91c2x5IHRyYWluZWQgbW9kZWwuCkFzIHRoZXJlIGFyZSBvbmx5IDIwIGNsdXN0ZXJzIHRvIGNsYXNzaWZ5LCB0aGlzIHdpbGwgYmUgdmVyeSBxdWljaywgYW5kIHdlIGRvbid0IG5lZWQgdG8gcGFyYWxsZWxpemUgaXQhCgpgYGB7ciBzaW5nbGVyIGNsdXN0ZXIsIGxpdmU9VFJVRX0KIyBydW4gU2luZ2xlUiBjbGFzc2lmaWNhdGlvbiB3aXRoIHByZXZpb3VzbHkgdHJhaW5lZCBtb2RlbApzaW5nbGVyX2NsdXN0ZXIgPC0gU2luZ2xlUjo6Y2xhc3NpZnlTaW5nbGVSKAogIGNsdXN0ZXJfbWF0LCAjIGNsdXN0ZXIgZXhwcmVzc2lvbiBtYXRyaXgKICBzaW5nbGVyX2ZpbmVtb2RlbCAjIHByZS10cmFpbmVkIG1vZGVsCikKCiMgdmlldyByZXN1bHRzCmhlYWQoc2luZ2xlcl9jbHVzdGVyKQpgYGAKClRoZSByZXN1bHQgaXMgYSBmYWlybHkgc21hbGwgdGFibGUgb2YgcmVzdWx0cywgYnV0IHdlIGFyZSBtb3N0IGludGVyZXN0ZWQgaW4gdGhlIGxhYmVscywgd2hpY2ggd2Ugd291bGQgbGlrZSB0byBhc3NvY2lhdGUgd2l0aCBlYWNoIGNlbGwgaW4gb3VyIFNDRSBvYmplY3QgZm9yIHZpc3VhbGl6YXRpb24uClNpbmNlIHRoZSBjbHVzdGVyIGxhYmVscyBhcmUgdGhlIHJvdyBuYW1lcyBvZiB0aGF0IHRhYmxlLCB3ZSBjYW4gcGVyZm9ybSBhIGN1dGUgbGl0dGxlIHRyaWNrIHRvIGFzc2lnbiBsYWJlbHMgYmFjayB0byBlYWNoIGNlbGwgYmFzZWQgb24gdGhlIG5hbWUgb2YgdGhlIGNsdXN0ZXIgdGhhdCBpdCB3YXMgYXNzaWduZWQgdG8uIAooSW4gdGhpcyBjYXNlIHRoZSBjbHVzdGVyIG5hbWVzIGFyZSBhbGwgbnVtYmVycywgYnV0IHRoYXQgbWlnaHQgbm90IGFsd2F5cyBiZSB0aGUgY2FzZS4pCldlJ2xsIHNlbGVjdCB2YWx1ZXMgcmVwZWF0ZWRseSBmcm9tIHRoZSBgc2luZ2xlcl9jbHVzdGVyYCB0YWJsZSwgdXNpbmcgdGhlIGNsdXN0ZXIgYXNzaWdubWVudCB0byBwaWNrIGEgcm93LCBhbmQgdGhlbiBhbHdheXMgcGlja2luZyB0aGUgYHBydW5lZC5sYWJlbHNgIGNvbHVtbi4KCmBgYHtyIGFzc2lnbiBjZWxsIGxhYmVsc30Kc2NlJGNlbGx0eXBlX2NsdXN0ZXIgPC0gc2luZ2xlcl9jbHVzdGVyW3NjZSRubl9jbHVzdGVyLCAicHJ1bmVkLmxhYmVscyJdCmBgYAoKTm93IHdlIGNhbiBwbG90IHRoZXNlIGNsdXN0ZXItYmFzZWQgY2VsbCB0eXBlIGFzc2lnbm1lbnRzIHVzaW5nIHRo
ZSBub3cgZmFtaWxpYXIgYHBsb3RVTUFQKClgIGZ1bmN0aW9uLgoKYGBge3IgcGxvdCBjbHVzdGVyIGNlbGx0eXBlcywgbGl2ZT1UUlVFfQpzY2F0ZXI6OnBsb3RVTUFQKHNjZSwgY29sb3JfYnkgPSAiY2VsbHR5cGVfY2x1c3RlciIpCmBgYAoKVGhpcyBzdXJlIGxvb2tzIG5pY2UgYW5kIGNsZWFuLCBidXQgd2hhdCBoYXZlIHdlIHJlYWxseSBkb25lIGhlcmU/IApXZSBhcmUgX2Fzc3VtaW5nXyB0aGF0IGVhY2ggY2x1c3RlciBoYXMgb25seSBhIHNpbmdsZSBjZWxsIHR5cGUsIHdoaWNoIGlzIGEgcHJldHR5IGJvbGQgYXNzdW1wdGlvbiwgYXMgd2UgcmVhbGx5IGFyZW4ndCBzdXJlIHRoYXQgdGhlIGNsdXN0ZXJzIHdlIGNyZWF0ZWQgd2VyZSBjb3JyZWN0LgpZb3UgbWF5IHJlY2FsbCB0aGF0IGNsdXN0ZXJpbmcgYWxnb3JpdGhtcyBhcmUgcXVpdGUgc2Vuc2l0aXZlIHRvIHBhcmFtZXRlciBjaG9pY2UsIHNvIGEgZGlmZmVyZW50IHBhcmFtZXRlciBjaG9pY2UgY291bGQgcXVpdGUgbGlrZWx5IGdpdmUgYSBkaWZmZXJlbnQgcmVzdWx0LgoKIyMjIE1ldGFDZWxsIGFwcHJvYWNoZXMKCkFzIGEgbWlkZGxlIGdyb3VuZCBiZXR3ZWVuIHRoZSBwb3RlbnRpYWxseSBtZXNzeSBzaW5nbGUtY2VsbCBjZWxsIHR5cGUgYXNzaWdubWVudCBhbmQgdGhlIGFsbW9zdC1jZXJ0YWlubHkgb3ZlcmNvbmZpZGVudCBjbHVzdGVyLWJhc2VkIGFzc2lnbm1lbnQgYWJvdmUsIHdlIGNhbiB0YWtlIGFwcHJvYWNoIGluc3BpcmVkIGJ5IFtCYXJhbiBfZXQgYWwuXyAoMjAxOSldKGh0dHBzOi8vZG9pLm9yZy8xMC4xMTg2L3MxMzA1OS0wMTktMTgxMi0yKSB1c2luZyBzb21ldGhpbmcgdGhleSBjYWxsZWQgX21ldGFjZWxsc18uIApUaGUgaWRlYSBpcyB0aGF0IHdlIGNhbiBwZXJmb3JtIGZpbmUtc2NhbGVkIGNsdXN0ZXJpbmcgdG8gaWRlbnRpZnkgZ3JvdXBzIG9mIHZlcnkgc2ltaWxhciBjZWxscywgdGhlbiBzdW0gdGhlIGNvdW50cyB3aXRoaW4gdGhvc2UgY2x1c3RlcnMgYXMgIm1ldGFjZWxscyIgdG8gdXNlIGZvciBmdXJ0aGVyIGFuYWx5c2lzLiAKVGhlIG9yaWdpbmFsIHBhcGVyIGluY2x1ZGVzIGEgbnVtYmVyIG9mIG9wdGltaXphdGlvbnMgdG8gbWFrZSBzdXJlIHRoYXQgdGhlIG1ldGFjZWxsIGNsdXN0ZXJzIGhhdmUgZGVzaXJhYmxlIHByb3BlcnRpZXMgZm9yIGRvd25zdHJlYW0gYW5hbHlzaXMuCldlIHdvbid0IGdvIGludG8gdGhhdCBkZXB0aCBoZXJlLCBidXQgd2UgY2FuIGFwcGx5IHNpbWlsYXIgaWRlYXMuCgpUbyBiZWdpbiwgd2Ugd2lsbCBwZXJmb3JtIHNvbWUgZmluZS1zY2FsZSBjbHVzdGVyaW5nLCB1c2luZyBhIHNpbXBsZXIgY2x1c3RlcmluZyBhbGdvcml0aG06IEstbWVhbnMgY2x1c3RlcmluZy4KV2Ugd2lsbCB1c2UgdGhlIHNhbWUgYGJsdXN0ZXJgIHBhY2thZ2UsIGNsdXN0ZXJpbmcgYmFzZWQgb24gdGhlIFBDQSByZXN1bHRzIHdlIGhhdmUgZnJvbSBlYXJsaWVyLCBidXQgdGhpcyBhbGdvcml0aG0gYWxsb3dzIHVzIHRvIHNwZWNpZnkg
dGhlIG51bWJlciBvZiBjbHVzdGVycyB3ZSB3YW50IHRvIGVuZCB1cCB3aXRoLgpXZSBoYXZlIGFib3V0IDgwMDAgY2VsbHMsIHNvIGxldCdzIGNsdXN0ZXIgdGhvc2UgaW50byBncm91cHMgb2YgYXBwcm94aW1hdGVseSA4MCBjZWxscywgd2hpY2ggd29ya3Mgb3V0IHRvIDEwMCBjbHVzdGVycy4KV2hpbGUgdGhpcyBpcyBhbG1vc3QgY2VydGFpbmx5IG1vcmUgY2x1c3RlcnMgdGhhbiBhcmUgInJlYWwiIGluIHRoaXMgZGF0YXNldCwgb3VyIGdvYWwgaGVyZSBpcyBub3QgdG8gZmluZCBkaWZmZXJlbmNlcyBhbW9uZyBjbHVzdGVycywganVzdCB0byBnZXQgaG9tb2dlbmVvdXMgZ3JvdXBzIG9mIGNlbGxzLgoKYGBge3Iga21lYW5zIGNsdXN0ZXJ9CiMgcGVyZm9ybSBrLW1lYW5zIGNsdXN0ZXJpbmcKa2NsdXN0ZXJzIDwtIGJsdXN0ZXI6OmNsdXN0ZXJSb3dzKAogIHJlZHVjZWREaW0oc2NlLCAiUENBIiksIAogIGJsdXN0ZXI6OkttZWFuc1BhcmFtKAogICAgY2VudGVycyA9IDEwMCwgIyB0aGUgbnVtYmVyIG9mIGNsdXN0ZXJzIAogICAgaXRlci5tYXggPSAxMDAgIyBtb3JlIGl0ZXJhdGlvbnMgdG8gYmUgc3VyZSBvZiBjb252ZXJnZW5jZQogICkKKQpgYGAKCk5vdyB3ZSBjYW4gYXBwbHkgZXhhY3RseSB0aGUgc2FtZSBhcHByb2FjaCB3ZSBkaWQgd2hlbiB3ZSBoYWQgdGhlIDIwIGNsdXN0ZXJzIHdlIGhhZCBpZGVudGlmaWVkIHdpdGggdGhlIGVhcmxpZXIgZ3JhcGgtYmFzZWQgY2x1c3RlcmluZy4KCmBgYHtyIG1ldGFjZWxsIHNpbmdsZXJ9CiMgY3JlYXRlIGEgIm1ldGFjZWxsIiBtYXRyaXggYnkgc3VtbWluZyBmaW5lLXNjYWxlIGNsdXN0ZXJzCm1ldGFjZWxsX21hdCA8LSBEZWxheWVkQXJyYXk6OmNvbHN1bShjb3VudHMoc2NlKSwga2NsdXN0ZXJzKQoKIyBhcHBseSBTaW5nbGVSIG1vZGVsIHRvIG1ldGFjZWxsIG1hdHJpeAptZXRhY2VsbF9zaW5nbGVyIDwtIFNpbmdsZVI6OmNsYXNzaWZ5U2luZ2xlUigKICBtZXRhY2VsbF9tYXQsIAogIHNpbmdsZXJfZmluZW1vZGVsCikKCiMgYXBwbHkgbWV0YWNlbGwgY2VsbCB0eXBlIGFzc2lnbm1lbnRzIHRvIGluZGl2aWR1YWwgY2VsbHMKc2NlJGNlbGx0eXBlX21ldGFjZWxsIDwtIG1ldGFjZWxsX3NpbmdsZXJba2NsdXN0ZXJzLCAicHJ1bmVkLmxhYmVscyJdCmBgYAoKTm93IHdlIGNhbiBwbG90IHRoZSByZXN1bHRzIGFzIHdlIGhhdmUgZG9uZSBiZWZvcmUuCgpgYGB7ciBtZXRhY2VsbCB1bWFwfQpzY2F0ZXI6OnBsb3RVTUFQKHNjZSwgY29sb3JfYnkgPSAiY2VsbHR5cGVfbWV0YWNlbGwiKQpgYGAKCldoYXQgZG8geW91IHRoaW5rIG9mIHRoaXMgcGxvdD8gCklzIHRoaXMgbW9yZSBvciBsZXNzIHVzZWZ1bCB0aGFuIHRoZSBvcmlnaW5hbCBjZWxsLWJhc2VkIGNsdXN0ZXJpbmc/CgoKIyMgU2F2ZSByZXN1bHRzCgpUbyBzYXZlIGRpc2sgc3BhY2UgKGFuZCB0aW1lKSwgd2Ugd29uJ3Qgd3JpdGUgb3V0IHRoZSB3aG9sZSBTQ0Ugb2JqZWN0LCBhcyB3ZSBoYXZlbid0IGNoYW5nZWQgYW55IG9m
IHRoZSBjb3JlIGRhdGEgdGhlcmUuIApJbnN0ZWFkIHdlIHdpbGwganVzdCB3cml0ZSBvdXQgdGhlIGNlbGwgaW5mb3JtYXRpb24gdGFibGUgKGBjb2xEYXRhYCkgYXMgYSBUU1YgZmlsZS4KCmBgYHtyIHNhdmUgY2VsbCBpbmZvLCBsaXZlPVRSVUV9CmNvbERhdGEoc2NlKSB8PgogIGFzLmRhdGEuZnJhbWUoKSB8PgogIHJlYWRyOjp3cml0ZV90c3YoZmlsZSA9IGNlbGxpbmZvX2ZpbGUpCmBgYAoKCiMjIFByaW50IHNlc3Npb24gaW5mbwoKYGBge3Igc2Vzc2lvbiBpbmZvfQpzZXNzaW9uSW5mbygpCmBgYAoK
+ + +
+
+ +
+ + + + + + + + + + + + + + + + +