Data design patterns for data projects
======================================
css: style.css
[Pierce Edmiston](https://github.com/pedmiston)
```{r config, echo = FALSE}
library(knitr)
opts_chunk$set(
echo = FALSE,
fig.align = "center"
)
```
A quick poll
============
incremental: true
- Mac | Windows | Linux
- R | RStudio
- RStudio projects? (.Rproj)
- Custom R packages? `library(my_stuff)`
- terminal?
- pandoc?
Spirit animal
=============
type: center
![The platypus](img/platypus.jpg)
What is a data project?
=======================
- **Reproducible**. Easy to redo everything on any computer.
- **Replicable**. New data, same code.
- **Reusable**. New context, same tools.
> A data project is not a data set.
What is a design pattern?
=========================
> Structure for your code decided by someone else.
- Upfront cost to prevent likely headaches.
- Unified structure helps when debugging.
- Easy to implement new features.
Design patterns for data projects
=================================
- **[Egg projects](https://github.com/pedmiston/data-design-patterns/tree/master/design-patterns/egg-projects)**. A useful configuration of R packages and RStudio projects.
- **[Parallel reports](https://github.com/pedmiston/data-design-patterns/tree/master/design-patterns/parallel-reports)**. Untangling code from report writing for better interactive development.
- **[Merge recode](https://github.com/pedmiston/data-design-patterns/tree/master/design-patterns/merge-recode)**. Authoritative recoder functions.
- **[Green stats](https://github.com/pedmiston/data-design-patterns/tree/master/design-patterns/green-stats)**. Result sections in knitr. Could save your life!
- **[DRY plots](https://github.com/pedmiston/data-design-patterns/tree/master/design-patterns/dry-plots)**. A small pattern for getting the most out of ggplot.
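
Merge recode: a sketch
======================
A minimal sketch of what an authoritative recoder function might look like, assuming the pattern means keeping each mapping in one function and merging it onto the data. The name `recode_condition`, the column names, and the labels are hypothetical and are not taken from the linked repository. Note that `merge()` may reorder rows.

```r
# Hypothetical recoder: names and levels below are illustrative only.
recode_condition <- function(frame) {
  # The single place where raw codes are mapped to labels and contrasts.
  condition_map <- data.frame(
    condition = c("a", "b"),
    condition_label = c("Control", "Treatment"),
    condition_c = c(-0.5, 0.5),
    stringsAsFactors = FALSE
  )
  # Left join: keep every row of `frame`; unmatched codes get NA.
  merge(frame, condition_map, by = "condition", all.x = TRUE)
}

# Example data, made up for this sketch.
trials <- data.frame(
  subject = 1:4,
  condition = c("a", "b", "b", "a"),
  stringsAsFactors = FALSE
)
recoded <- recode_condition(trials)
```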
Example data projects
=====================
- **[github-pulse](https://github.com/pedmiston/github-pulse)**. Analyze GitHub event data.
- **[property-verification](https://github.com/lupyanlab/property-verification)**. A reproducible psychology experiment.
- **[wikischolar](https://github.com/evoapps/wikischolar)**. Evolution of Wikipedia article quality.