Improve help and vignette for isolate_chop().
hughjonesd committed Jun 7, 2024
1 parent 8314617 commit fb08f13
Showing 3 changed files with 59 additions and 2 deletions.
10 changes: 10 additions & 0 deletions R/chop.R
@@ -674,6 +674,9 @@ chop_spikes <- function (
#' `isolate_chop()` does not typically chop `x` into disjoint intervals. See
#' the examples.
#'
#' If breaks are data-dependent, their labels may be misleading after common
#' elements have been removed. See the example below.
#'
#' Levels of the result are ordered by the minimum element in each level. As
#' a result, if `drop = FALSE`, empty levels will be placed last.
#'
@@ -702,6 +705,13 @@ chop_spikes <- function (
#' table(isolate_chop(x, brk_width(2, 0), prop = 0.05))
#' # Versus:
#' tab_spikes(x, brk_width(2, 0), prop = 0.05)
#'
#' # Misleading data-dependent breaks:
#' set.seed(42)
#' x <- rnorm(99)
#' x[1:10] <- x[1]
#' tab_quantiles(x, 1:2/3)
#' table(isolate_chop(x, brk_quantiles(1:2/3), prop = 0.1))
isolate_chop <- function (x,
breaks,
...,
10 changes: 10 additions & 0 deletions man/isolate_chop.Rd


41 changes: 39 additions & 2 deletions vignettes/santoku.Rmd
@@ -72,7 +72,7 @@ To quickly produce a table of chopped data, use `tab()`:
tab(1:10, c(2, 5, 8))
```

-## More ways to chop
+## Chopping by width and number of elements

To chop into fixed-width intervals, starting at the minimum value, use
`chop_width()`:
@@ -90,7 +90,7 @@ chopped <- chop_evenly(x, intervals = 3)
data.frame(x, chopped)
```

-To chop into groups with a fixed number of members, use `chop_n()`:
+To chop into groups with a fixed number of elements, use `chop_n()`:

```{r}
chopped <- chop_n(x, 4)
@@ -131,6 +131,8 @@ intervals of different specific sizes... | `chop_quantiles()` | `chop_proporti

: Different ways to chop by size

## Even more ways to chop

To chop data by standard deviations around the mean, use `chop_mean_sd()`:

```{r}
@@ -147,6 +149,41 @@ chopped <- chop_pretty(x)
data.frame(x, chopped)
```


## Isolating common values

In exploratory work, it's sometimes useful to find common values and
treat them differently. You can use `isolate_chop()` to do this:

```{r}
x_spike <- rnorm(100)
x_spike[1:50] <- x_spike[1]
chopped <- isolate_chop(x_spike, -3:3, prop = 0.1)
table(chopped)
```

`prop = 0.1` will put any unique value of `x` into its own separate
category if it makes up at least 10% of the data.

Note that unlike all the other `chop_*` functions, `isolate_chop()`
doesn't always categorize `x` into ordered, connected intervals.
To remind you of this, it is named differently. If you want to create
separate intervals on the left and right of common elements, use
`chop_spikes()`:

```{r}
chopped <- chop_spikes(x_spike, -3:3, prop = 0.1)
table(chopped)
```

Compare this to the previous table. There is now a separate interval
on each side of the common value, instead of one interval surrounding it.


## Quick tables

`tab_n()`, `tab_width()`, and friends act similarly to
`tab()`, calling the related `chop_*` function and then `table()` on the result.
