-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
277 lines (158 loc) · 8.8 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
# infixit <img src="man/figures/logo.png" align="right" height="138"/>
<!-- badges: start -->
[![R-CMD-check](https://github.com/prlitics/infixit/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/prlitics/infixit/actions/workflows/R-CMD-check.yaml) [![Codecov test coverage](https://codecov.io/gh/prlitics/infixit/branch/main/graph/badge.svg)](https://app.codecov.io/gh/prlitics/infixit?branch=main)
<!-- badges: end -->
```{r setup, include=FALSE}
library(infixit)
knitr::opts_chunk$set(echo = TRUE)
options(infixit.paste = "paste0")
options(infixit.btwn = c("[","]"))
```
> Infix functions get their name from the fact the function name comes inbetween its arguments, and hence have two arguments.
>
> --- Hadley Wickham, [Advanced R](https://adv-r.hadley.nz/functions.html#:~:text=Infix%20functions%20get%20their%20name,%2C%20%3C%3D%20%2C%20%3D%3D%20%2C%20!%3D)
`{infixit}` is an R package that looks to extend the available infix operators in R with some that are helpful for programming and data manipulation tasks. Additionally, the specific behaviors of many of these operators are (to an extent) customizable using some of the package's `options()`.
It can be downloaded through the following:
```{r install, eval=FALSE}
library(remotes)
install_github("prlitics/infixit")
```
There are currently 8 infix functions packaged with `{infixit}`:
1. `%+%`: Providing string concatenation.
2. `%nin%`: Providing the inverse of the `%in%` function (e.g., whether an element of $X$ _is **not**_ in $Y$).
3. `%btwn%`: Lets users determine if numeric elements (including date objects) of $X$ are **b**e**tw**ee**n** values $Y_1$ and $Y_2$.
4. Five augmented assignment operations which take the left-hand object and reassigns its value based off the right-hand value. For example, let's say we have an object `apple` with a value of 12. `apple %+=% 1` takes the current value of `apple`, adds 1 to it (12 + 1 = 13), and then updates the value of `apple` to this sum. The five operations are:
* `%+=%`: Updates left-hand object by *adding* it to the right-hand object.
* `%-=%`: Updates left-hand object by *subtracting* it from the right-hand object.
* `%/=%`: Updates left-hand object by *dividing* it by the right-hand object.
* `%*=%`: Updates left-hand object by *multiplying* it by the right-hand object.
* `%^=%`: Updates left-hand object by *exponentiating* it by the right-hand object.
While there are ways to achieve the end-behaviors of these functions, the intent is to do so in a way that maximizes the ease of coders and the readability of the code they produce.
## Code Examples
### `%+%` (String Concatenation)
Use `%+%` to paste strings together without wrapping it within `paste0` or `paste`.
```{r concat}
a <- "Hello "
b <- "world!"
c <- " Let's do our best!"
a %+% b %+% c
```
However, it some instances, users might apprciate having a bit of extra padding to their strings, such as pasting full sentences together. By default `%+%` wraps around `paste0`, but`infixit` has the option for users to specify `paste` as the concatenation method, which will add a space between pasted objects.
```{r concat2}
options(infixit.paste = "paste") #default is paste0
a <- "Hello" #Notice no trailing space here to accommodate "world"
b <- "world!"
c <- "Let's do our best!" #No leading space here.
a %+% b %+% c
```
In cases where the user wants to use a different separator when using `"paste"` rather than the default `" "`, they can specify this using the `infixit.paste_sep` option. In the example below, this is done to change the delimiter to be a vertical pipe ("|"):
```{r concat3}
options(infixit.paste_sep = "|") #default is paste0
a <- "Hello"
b <- "world!"
c <- "Let's do our best!"
a %+% b %+% c
```
### `%nin%` (Not in)
Normally, you can use `%in%` to check if your left-hand-side element is **in** the set of elments on the right hand side.
```{r in}
fruits <- c("apple", "banana", "kiwi", "pears")
"apple" %in% fruits
"tomato" %in% fruits
```
Use `%nin%` when you want to select features that are **n**ot **in** a set of values.
```{r nin}
fruits <- c("apple", "banana", "kiwi", "pears")
"apple" %nin% fruits
"tomato" %nin% fruits
```
`%nin` can be useful when you're filtering data. Let's say, for example, that you are working with the `penguins` data from the [`{palmerpenguins}`](https://allisonhorst.github.io/palmerpenguins/) package. You are only interested in penguins that are not from either "Torgersen" nor "Biscoe" islands.
```{r nin_filter, warning=FALSE}
suppressPackageStartupMessages(library(palmerpenguins))
suppressPackageStartupMessages(library(dplyr))
penguins %>%
filter(island %nin% c("Torgersen", "Biscoe")) %>%
count(island)
```
Now the data are only limited to "Dream" island. Though a relatively simple example, this functionality can be especially useful if you have a long list of things that a value could be `%in%`.
### `%btwn%` (Whether a numeric value is between two others)
Use `%btwn%` to determine whether values on the left-hand-side are within the bounds defined on the right-hand-side. `%btwn%` can accomodate integer, double numeric-types as well as strings that can be coerced into a date or POSIXlt object: basically anything that, at the end of the day, can be coerced to a numeric value.
```{r btwn}
c(1,2,3.5,4.2,5,6) %btwn% c(2,4)
dates_seq <- seq(as.Date("2020-01-01"),as.Date("2021-03-31"), by = "month")
dates_seq %btwn% c("2019-12-31","2021-01-01")
```
You can pass unique datetime formats for the comparison set via the "infixit.btwn.datetimefmt" option.
```{r btwn_dates}
options(infixit.btwn.datetimefmt = "%b %d, %Y")
dates_seq %btwn% c("Dec 31, 2019", "Jan 01, 2021")
```
By default, the bounds on the right-hand-side are considered **_inclusive_**, meaning that if a left-hand-side value matches one of the two bounds, it will return as `TRUE`. In mathematics, inclusivity can be denoted with square brackets "[" or "]". It is possible to change the behavior such that the values are **exclusive**, meaning that an exact match for the specified boundary value will return as `FALSE`.
The "infixit.btwn" option that is loaded with the package allows users to define whether the lower boundary is (in/ex)clusive and/or whether the upper boundary is (in/ex)clusive.
```{r btwn_opts}
options(infixit.btwn = c("[","]")) #inclusive left and right, default
c(1,2,3,4,5) %btwn% c(2,4)
options(infixit.btwn = c("[",")")) #inclusive left, exclusive right
c(1,2,3,4,5) %btwn% c(2,4)
options(infixit.btwn = c("(","]")) #exclusive left, inclusive right
c(1,2,3,4,5) %btwn% c(2,4)
options(infixit.btwn = c("(",")")) #exclusive left and right
c(1,2,3,4,5) %btwn% c(2,4)
```
`%btwn%` can be especially helpful in the context of `{dplyr}`'s `case_when` function. Let's imagine that we are trying to group penguins by body mass (chunk 'em by chonk, one might say). We want to put them into quartiles; 0-24.99% of the sample, 25-49.99% of the sample, 50-74.99% of the sample, and 75% to the sample max. We can discover these values using the `quantile` function.
```{r penguin_mass}
quantile(penguins$body_mass_g, na.rm = TRUE)
```
A normal way to do this with `case_when` would be:
```{r case_standard}
penguins %>%
mutate(chonk_level = case_when(
body_mass_g < 3550 ~ 1,
body_mass_g >= 3550 & body_mass_g < 4050 ~ 2,
body_mass_g >= 4050 & body_mass_g < 4750 ~ 3,
body_mass_g >= 4750 ~ 4
)) %>%
select(body_mass_g, chonk_level) %>%
head()
```
With `%btwn%`:
```{r case_btwn}
options(infixit.btwn = c("[",")"))
penguins %>%
mutate(chonk_level = case_when(
body_mass_g < 3550 ~ 1,
body_mass_g %btwn% c(3550, 4050) ~ 2,
body_mass_g %btwn% c(4050, 4750) ~ 3,
body_mass_g >= 4750 ~ 4
)) %>%
select(body_mass_g, chonk_level) %>%
head()
```
### Augmented arithmetic reasignment (`%+=%` and kin)
In languages such as Python, it is possible to take an object with a numeric value and update/reassign it with a single operation. So, for example, if I had `test = 5`, I could do `test += 5` and then my new value of `test` would be 10. This sort of behavior is called "augmented assignment", and it can be very useful when doing things in loops.
In R, you currently would have to reassign the value like this: `test <- test + 5`. Some programmers find this to be more verbose than it needs to be. So, for example:
```{r updating_paste, echo=FALSE}
options(infixit.paste_sep = "")
```
```{r old_assignment_v_new}
v1 <- 0
v2 <- 0
for (i in 1:5) {
v1 <- v1 + i
v2 %+=% i
print("v1 is " %+% v1 %+% " and v2 is " %+% v2)
}
identical(v1, v2)
```
This functionality offers some fun and interesting possibilities for updating vectors as well:
```{r aug_vec_assignment}
v1 <- 1:5
v1 %*=% 2
print(v1)
v1 %-=% 1:5
print(v1)
```