Improve help and vignette for isolate_chop().

hughjonesd · Jun 7, 2024 · commit fb08f13
1 parent 8314617

Showing 3 changed files with 59 additions and 2 deletions.
diff --git a/R/chop.R b/R/chop.R
@@ -674,6 +674,9 @@ chop_spikes <- function (
 #' `isolate_chop()` does not typically chop `x` into disjoint intervals. See
 #' the examples.
 #'
+#' If breaks are data-dependent, their labels may be misleading after common
+#' elements have been removed. See the example below.
+#'
 #' Levels of the result are ordered by the minimum element in each level. As
 #' a result, if `drop = FALSE`, empty levels will be placed last.
 #'
@@ -702,6 +705,13 @@ chop_spikes <- function (
 #' table(isolate_chop(x, brk_width(2, 0), prop = 0.05))
 #' # Versus:
 #' tab_spikes(x, brk_width(2, 0), prop = 0.05)
+#'
+#' # Misleading data-dependent breaks:
+#' set.seed(42)
+#' x <- rnorm(99)
+#' x[1:10] <- x[1]
+#' tab_quantiles(x, 1:2/3)
+#' table(isolate_chop(x, brk_quantiles(1:2/3), prop = 0.1))
 isolate_chop <- function (x,
                           breaks,
                           ...,

diff --git a/man/isolate_chop.Rd b/man/isolate_chop.Rd
diff --git a/vignettes/santoku.Rmd b/vignettes/santoku.Rmd
@@ -72,7 +72,7 @@ To quickly produce a table of chopped data, use `tab()`:
 tab(1:10, c(2, 5, 8))
 ```
 
-## More ways to chop
+## Chopping by width and number of elements
 
 To chop into fixed-width intervals, starting at the minimum value, use
 `chop_width()`:
@@ -90,7 +90,7 @@ chopped <- chop_evenly(x, intervals = 3)
 data.frame(x, chopped)
 ```
 
-To chop into groups with a fixed number of members, use `chop_n()`:
+To chop into groups with a fixed number of elements, use `chop_n()`:
 
 ```{r}
 chopped <- chop_n(x, 4)
@@ -131,6 +131,8 @@ intervals of different specific sizes...   | `chop_quantiles()` | `chop_proporti
 
 : Different ways to chop by size
 
+## Even more ways to chop
+
 To chop data by standard deviations around the mean, use `chop_mean_sd()`:
 
 ```{r}
@@ -147,6 +149,41 @@ chopped <- chop_pretty(x)
 data.frame(x, chopped)
 ```
 
+
+## Isolating common values
+
+In exploratory work, it's sometimes useful to find common values and
+treat them differently. You can use `isolate_chop()` to do this:
+
+```{r}
+x_spike <- rnorm(100)
+x_spike[1:50] <- x_spike[1]
+
+chopped <- isolate_chop(x_spike, -3:3, prop = 0.1)
+table(chopped)
+
+```
+
+`prop = 0.1` will put any unique value of `x` into its own separate
+category if it makes up at least 10% of the data.
+
+Note that unlike the `chop_*` functions, `isolate_chop()` doesn't
+always categorize `x` into ordered, connected intervals, which is
+why it is named differently. If you want to create
+separate intervals on the left and right of common elements, use
+`chop_spikes()`:
+
+```{r}
+chopped <- chop_spikes(x_spike, -3:3, prop = 0.1)
+table(chopped)
+```
+
+Compare this to the table before. The common value now has one
+interval on each side of it, instead of a single interval surrounding it.
+
+
+## Quick tables
+
 `tab_n()`, `tab_width()`, and friends act similarly to
 `tab()`, calling the related `chop_*` function and then `table()` on the result.
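
Not part of the commit: the `prop` threshold described in the "Isolating common values" section can be checked by hand in base R. The sketch below is only an illustration and does not call santoku. It rebuilds the vignette's `x_spike` vector and lists the distinct values whose share of the data meets the threshold. The seed and the names `shares` and `common` are assumptions added here, and the "at least" comparison follows the vignette's wording rather than the package source.

```r
# Base R only; an illustrative sketch, not santoku's implementation.
set.seed(1)  # assumption: the vignette chunk itself sets no seed
x_spike <- rnorm(100)
x_spike[1:50] <- x_spike[1]

# Share of the data taken up by each distinct value
shares <- prop.table(table(x_spike))

# Distinct values at or above the threshold are the candidates
# that the vignette says get their own category
prop <- 0.1
common <- shares[shares >= prop]

length(common)   # how many distinct values meet the threshold
unname(common)   # their share of the data
```

Under these assumptions, only the repeated first value should qualify, since every other draw occurs once and accounts for 1% of the data.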