---
title: Getting started in R
author: "Alexander Spina and Patrick Keating"
output:
  html_document:
    toc: yes
    toc_float:
      collapsed: no
      smooth_scroll: yes
  pdf_document: default
  word_document: default
geometry: margin=1.5cm
---
The following code has been adapted to *R* for learning purposes. The initial contributors and copyright license are listed below. All copyrights and licenses of the original document apply here as well.
**Contributors to *R* code:**
Daniel Gardiner (PHE) and Lukas Richter (AGES)
**Updated to the 2018 edition by:**
Ashley Sharp (PHE) and Hikaru Bolt (PHE)
#Copyright and license
This case study was designed under an ECDC service contract for the development of training material (2010). The data were slightly modified for training purposes.
**You are free:**
+ to Share - to copy, distribute and transmit the work
+ to Remix - to adapt the work
Under the following conditions:
+ Attribution - You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). The best way to do this is to keep as it is the list of contributors: sources, authors and reviewers.
+ Share Alike - If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one. Your changes must be documented. Under that condition, you are allowed to add your name to the list of contributors.
+ You cannot sell this work alone but you can use it as part of a teaching.
With the understanding that:
+ Waiver - Any of the above conditions can be waived if you get permission from the copyright holder.
+ Public Domain - Where the work or any of its elements is in the public domain under applicable law, that status is in no way affected by the license.
+ Other Rights - In no way are any of the following rights affected by the license:
    + Your fair dealing or fair use rights, or other applicable copyright exceptions and limitations;
    + The author's moral rights;
    + Rights other persons may have either in the work itself or in how the work is used, such as publicity or privacy rights.
+ Notice - For any reuse or distribution, you must make clear to others the license terms of this work by keeping together this work and the current license.
This licence is based on http://creativecommons.org/licenses/by-sa/3.0/
#An introduction to the R companion#
“To understand computations in R, two slogans are helpful:
- Everything that exists is an object.
- Everything that happens is a function call.”
John Chambers
If you look at the Global Environment panel (by default in the upper right of the screen) you will see a list of objects stored in that environment. When you load your data into R you create an object. This object is completely separate from the data file itself (the Excel file, CSV file, etc.). You can create as many objects as you like: for example, you could store a few variables from your original data as a new object, or create a summary table and store that.
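As a minimal sketch of this idea, the code below creates a few objects. The data frame `patients` and its columns are invented for illustration and are not part of the case study data.

```r
# Hypothetical example data, invented for illustration only
patients <- data.frame(
  id  = 1:4,
  age = c(23, 35, 41, 58),
  ill = c(TRUE, FALSE, TRUE, TRUE)
)

# Store a single variable from the data as a new object
ages <- patients$age

# Create a summary table and store that as another object
ill_counts <- table(patients$ill)

# List the objects currently stored in the environment
ls()
```

After running this, each of the three objects should appear in the Global Environment panel.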
Functions in R are equivalent to commands in Stata. All functions take the form of a name followed by brackets, e.g. functionname(). Inside the brackets go the arguments. You can access the help file for a function by calling ?functionname. The help file shows which arguments the function takes and what the function does. Arguments have a default order, as specified in the help file, though you can supply them in any order by naming each argument with the equals sign "=".
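To illustrate argument order, here is a short sketch using the base R function `seq()`, chosen only as a convenient example. Its first three arguments are `from`, `to` and `by`.

```r
# Arguments supplied in their default order: from, to, by
seq(1, 10, 3)

# The same call with every argument named, so the order no longer matters
seq(by = 3, to = 10, from = 1)

# To see the arguments and their default order, run this line in the console:
# ?seq
```

Both calls should return the same sequence; naming the arguments mainly makes the code easier to read.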
A good reference for R users is the book R for Data Science by Garrett Grolemund and Hadley Wickham. This is available free online at http://r4ds.had.co.nz/.
###RStudio projects
The easiest way to work with R is using RStudio 'projects'. RStudio is a graphical user interface that runs R in the background. A 'project' is an RStudio file that saves your workspace so you can easily pick up from where you left off. Put all the files that you will need for this case study in a folder called 'Copenhagen' and create a project in the same folder by clicking File -> New Project -> Existing Directory, and choosing the folder. For simplicity, make sure there are no subfolders in this folder, and put all data and scripts in the main Copenhagen folder.
###Setting your working directory
Just as in Stata, you can set a folder to be your working directory. Open the project that you've created and you will see that the working directory is the project folder itself: you can check this by calling getwd(). You can see what's in your working directory by looking at the **Files tab** (by default in the bottom right area of the screen). If you want to change your working directory, use the function setwd(), e.g. setwd("C:/Users/yourname/Desktop/Copenhagen"). Note that R paths use forward slashes "/", while Windows paths use backslashes "\\", so if you copy a path from Windows you have to change them manually.
```r
getwd()
```
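If you would rather not edit a copied Windows path by hand, one possible approach is sketched below. The path is the placeholder used above, not a real location, and the `setwd()` call is left commented out so the code runs even if the folder does not exist.

```r
# A Windows path typed as an R string: each backslash has to be doubled
win_path <- "C:\\Users\\yourname\\Desktop\\Copenhagen"

# Replace every backslash with a forward slash
r_path <- gsub("\\", "/", win_path, fixed = TRUE)
r_path

# Uncomment once the folder exists on your computer
# setwd(r_path)
```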
###Installing packages and functions
R packages are bundles of functions which extend the capability of R. Thousands of add-on packages are available in the main online repository (known as CRAN) and many more packages in development can be found on GitHub. They may be installed and updated over the Internet.
We will mainly use packages which come ready installed with R (base code), but where it makes things easier we will use add-on packages. In addition, we have included a few extra functions to simplify the code required.
Run the following code at the beginning of the day to make sure that you have made available all the packages and functions that you need. Be sure to include it in any scripts too.
```r
# Loading required packages
library(ggplot2)
library(lubridate)
library(foreign)
library(Hmisc)
library(epiR)
library(epitools)
```
Note: This code assumes the packages are already installed. You only need to install a package once. If this code does not work, try installing the packages using the code below.
```
install.packages("package name in quotation marks")
```
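As a sketch of one way to install only the packages that are missing, the code below checks each package listed above. The helper `find_missing_packages()` is a hypothetical name introduced here, not part of any package, and the installation line is left commented out so nothing is downloaded unless you choose to.

```r
# Packages used in this case study
required_packages <- c("ggplot2", "lubridate", "foreign",
                       "Hmisc", "epiR", "epitools")

# Hypothetical helper: returns the names of packages that are not installed
find_missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

missing_packages <- find_missing_packages(required_packages)
missing_packages

# Uncomment to install whatever is missing (requires an internet connection)
# if (length(missing_packages) > 0) install.packages(missing_packages)
```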
R and Stata have minor differences in default settings and methods. In this document we will follow the Stata analysis as closely as possible, but small and usually unimportant differences may be noted between the statistical findings in R and those in Stata. At some points additional steps (which would usually be optional in R) will be taken to produce output which is comparable to that of Stata.
The code below will run two scripts that contain useful user-written functions. These scripts need to be present in your working directory. The epicurve function allows creation of easily formatted epicurves. To find out more about the function, first load it using the code below and then click on the function name in the **Global Environment** tab on the right of the RStudio window. The **single variable analysis** function allows calculation of attack rates for multiple variables at once and provides similar output to the cctable and cstable commands in Stata.
```
# These scripts need to be present in your working directory
# Adds a function to create epicurves
source("epicurve.v.1.8.R")
# Adds a function to create output similar to cctable or cstable in Stata
source("single.variable.analysis.v0.2.R")
```
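If `source()` fails because a script cannot be found, it can help to check the working directory first. The following is a minimal sketch, assuming the two file names above; it only sources the scripts that are actually present and prints a message for any that are not.

```r
# Script files expected in the working directory
script_files <- c("epicurve.v.1.8.R", "single.variable.analysis.v0.2.R")

# Check which of the scripts can be found
found <- file.exists(script_files)

for (i in seq_along(script_files)) {
  if (found[i]) {
    source(script_files[i])
  } else {
    message("Not found in ", getwd(), ": ", script_files[i])
  }
}
```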