-
Notifications
You must be signed in to change notification settings - Fork 7
/
Copy pathREADME.Rmd
109 lines (86 loc) · 3.47 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
---
title: "README"
author: "Jordan S Read"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
md_document:
variant: markdown_github
---
##Installation
To install the stable version of `sensorQC` package with dependencies:
```{r echo=TRUE, eval=FALSE}
install.packages("sensorQC",
repos = c("http://owi.usgs.gov/R","http://cran.rstudio.com/"),
dependencies = TRUE)
```
Or to install the current development version of the package (using the `devtools` package):
```{r echo=TRUE, eval=FALSE}
devtools::install_github("USGS-R/sensorQC")
```
This package is still very much in development, so the API may change at any time.
| Name | Status |
| :------------ |:-------------|
| Linux Build: | [![Build Status](https://travis-ci.org/USGS-R/sensorQC.svg?branch=master)](https://travis-ci.org/USGS-R/sensorQC) |
| Windows Build: | [![Build status](https://ci.appveyor.com/api/projects/status/pho8872wbnvaw5nt)](https://ci.appveyor.com/project/jread-usgs/sensorqc) |
| Package Tests: | [![Coverage Status](https://coveralls.io/repos/USGS-R/sensorQC/badge.svg?branch=master&service=github)](https://coveralls.io/github/USGS-R/sensorQC?branch=master) |
High-frequency aquatic sensor QAQC procedures. `sensorQC` imports data, and runs various statistical outlier detection techniques as specified by the user.
###`sensorQC` Functions (as of v`r packageVersion('sensorQC')`)
| Function | Title |
| ------------- |:-------------|
| `read` | read in a file for sensor data or a config (.yml) file |
| `window` | window sensor data for processing in chunks |
| `plot` | plot sensor data |
| `flag` | create data flags for a sensor |
| `clean` | remove or replace flagged data points |
### example usage
```{r}
library(sensorQC)
file <- system.file('extdata', 'test_data.txt', package = 'sensorQC')
sensor <- read(file, format="wide_burst", date.format="%m/%d/%Y %H:%M")
flag(sensor, 'x == 999999', 'persist(x) > 3', 'is.na(x)')
```
Use the `MAD` (median absolute deviation) test, and add `w` to the function call to specify "windows" (note, sensor must be windowed w/ `window()` prior to using `w`)
```{r}
sensor = window(sensor, type='auto')
flag(sensor, 'x == 999999', 'persist(x) > 3', 'MAD(x,w) > 3', 'MAD(x) > 3')
```
Use sensorQC with a simple vector of numbers:
```{r}
flag(c(3,2,4,3,3,4,2,4),'MAD(x) > 3')
```
#### plotting data
plot dataset w/ outliers:
```{r}
plot(sensor)
```
plot dataset w/o outliers:
```{r}
flagged = flag(sensor, 'x == 999999', 'persist(x) > 3', 'MAD(x,w) > 3', 'MAD(x) > 3')
plot(flagged)
```
#### cleaning data
The `clean` function can be used to strip flagged data points from the record or replace them with other values (such as `NA` or -9999)
```{r}
data = c(999999, 1,2,3,4,2,3,4)
sensor = flag(data, 'x > 9999')
clean(sensor)
clean(sensor, replace=NA)
```
if you have multiple flag rules, you can choose which ones to use by their index:
```{r}
data = c(999999, 1,2,3,4,2,3,4)
sensor = flag(data, 'x > 9999', 'x == 3')
clean(sensor, which=1)
clean(sensor, which=2)
```
or flag data and clean data all in one step:
```{r}
clean(data, 'x > 9999', 'persist(x) > 10', 'MAD(x) > 3', replace=NA)
```
#### flagging data with a moving window
The `MAD(x,w)` function can use a rolling window by leveraging the `RcppRoll` R package.
```{r}
sensor <- read(file, format="wide_burst", date.format="%m/%d/%Y %H:%M")
sensor = window(sensor, n=300, type='rolling')
flag(sensor, 'x == 999999', 'persist(x) > 3', 'MAD(x,w) > 3', 'MAD(x) > 3')
```