---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
```
# R/`origami` <img src="./hex/origami-sticker.png" align="right" width='125'/>
[![R-CMD-check](https://github.com/tlverse/origami/workflows/R-CMD-check/badge.svg)](https://github.com/tlverse/origami/actions)
[![Coverage Status](https://codecov.io/gh/tlverse/origami/branch/master/graph/badge.svg)](https://codecov.io/gh/tlverse/origami)
[![CRAN](http://www.r-pkg.org/badges/version/origami)](http://www.r-pkg.org/pkg/origami)
[![CRAN downloads](https://cranlogs.r-pkg.org/badges/origami)](https://CRAN.R-project.org/package=origami)
[![CRAN total downloads](http://cranlogs.r-pkg.org/badges/grand-total/origami)](https://CRAN.R-project.org/package=origami)
[![Project Status: Active - The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
[![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](http://www.gnu.org/licenses/gpl-3.0)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1155901.svg)](https://doi.org/10.5281/zenodo.1155901)
[![DOI](http://joss.theoj.org/papers/10.21105/joss.00512/status.svg)](https://doi.org/10.21105/joss.00512)
> High-powered framework for cross-validation. Fold your data like it's paper!
__Authors:__ [Jeremy Coyle](https://github.com/jeremyrcoyle), [Nima
Hejazi](https://nimahejazi.org), [Ivana
Malenica](https://github.com/podTockom), and [Rachael
Phillips](https://github.com/rachaelvphillips)
---
## What's `origami`?
The `origami` R package provides a general framework for the application of
cross-validation schemes to particular functions. By allowing arbitrary lists of
results, `origami` accommodates a range of cross-validation applications.
---
## Installation
For standard use, we recommend installing the package from
[CRAN](https://cran.r-project.org/) via
```{r cran-installation, eval = FALSE}
install.packages("origami")
```
You can install a stable release of `origami` from GitHub via
[`devtools`](https://www.rstudio.com/products/rpackages/devtools/) with:
```{r gh-installation, eval = FALSE}
devtools::install_github("tlverse/origami")
```
---
## Usage
For details on how best to use `origami`, please consult the package
[documentation](https://origami.tlverse.org) and [introductory
vignette](https://origami.tlverse.org/articles/generalizedCV.html)
online, or do so from within [R](https://www.r-project.org/).
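As a rough sketch of the "from within R" route, the standard base R help tools can be pointed at the installed package. Nothing here is specific to `origami` beyond the package name and the `cross_validate` function used in the example below; `vigs` is just an illustrative variable name.

```r
library(origami)

# list the vignettes shipped with the installed package
vigs <- vignette(package = "origami")
vigs$results[, c("Item", "Title")]

# open the help page for the main entry point
help("cross_validate", package = "origami")
```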
---
## Example
This minimal example shows how to use `origami` to cross-validate a simple
model on a sample data set. In particular, we fit a linear regression to the
`mtcars` data and obtain a cross-validated estimate of its mean squared error:
```{r example}
library(stringr)
library(origami)
set.seed(4795)

data(mtcars)
head(mtcars)

# build a cv_fun that wraps around lm
cv_lm <- function(fold, data, reg_form) {
  # get name and index of outcome variable from regression formula
  out_var <- as.character(unlist(str_split(reg_form, " "))[1])
  out_var_ind <- as.numeric(which(colnames(data) == out_var))

  # split up data into training and validation sets
  train_data <- training(data)
  valid_data <- validation(data)

  # fit linear model on training set and predict on validation set
  mod <- lm(as.formula(reg_form), data = train_data)
  preds <- predict(mod, newdata = valid_data)

  # capture results to be returned as output
  # (SE holds the squared prediction errors on the validation set)
  out <- list(coef = data.frame(t(coef(mod))),
              SE = ((preds - valid_data[, out_var_ind])^2))
  return(out)
}

folds <- make_folds(mtcars)
results <- cross_validate(cv_fun = cv_lm, folds = folds, data = mtcars,
                          reg_form = "mpg ~ .")
mean(results$SE)
```
For details on how to write wrappers (`cv_fun`s) for use with
`origami::cross_validate`, please consult the documentation and vignettes that
accompany the package.
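As a minimal sketch of what such a wrapper can look like, the hypothetical `cv_mean` below (not part of `origami`) predicts `mpg` in each validation set with the training-set mean. It assumes the same pattern as `cv_lm` above: the function receives a `fold` and the data, and returns a named list whose elements are combined across folds. The choice of `V = 5` folds is arbitrary, and `use_future = FALSE` simply requests sequential evaluation.

```r
library(origami)

set.seed(4795)
data(mtcars)

# hypothetical wrapper: score the training-set mean of mpg on the validation set
cv_mean <- function(fold, data) {
  train_data <- training(data, fold)
  valid_data <- validation(data, fold)
  mu_hat <- mean(train_data$mpg)
  list(
    mu = mu_hat,                      # one estimate per fold
    SE = (valid_data$mpg - mu_hat)^2  # one squared error per validation row
  )
}

# 5-fold cross-validation (V is passed through to folds_vfold)
folds <- make_folds(mtcars, fold_fun = folds_vfold, V = 5)
results <- cross_validate(cv_fun = cv_mean, folds = folds, data = mtcars,
                          use_future = FALSE)

# cross-validated mean squared error of the mean predictor
mean(results$SE)
```

Because every row of `mtcars` lands in exactly one validation set, `results$SE` should hold one squared error per observation, while `results$mu` should hold one fitted mean per fold.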
---
## Issues
If you encounter any bugs or have any specific feature requests, please [file an
issue](https://github.com/tlverse/origami/issues).
---
## Contributions
Contributions are very welcome. Interested contributors should consult our
[contribution
guidelines](https://github.com/tlverse/origami/blob/master/CONTRIBUTING.md)
prior to submitting a pull request.
---
## Citation
After using the `origami` R package, please cite it:
    @article{coyle2018origami,
      author = {Coyle, Jeremy R and Hejazi, Nima S},
      title = {origami: A Generalized Framework for Cross-Validation in R},
      journal = {The Journal of Open Source Software},
      volume = {3},
      number = {21},
      month = {January},
      year = {2018},
      publisher = {The Open Journal},
      doi = {10.21105/joss.00512},
      url = {https://doi.org/10.21105/joss.00512}
    }
---
## License
© 2017-2021 [Jeremy R. Coyle](https://github.com/jeremyrcoyle)
The contents of this repository are distributed under the GPL-3 license. See
file `LICENSE` for details.