improve documentation of tidy methods
EmilHvitfeldt committed Feb 24, 2024
1 parent 238a924 commit d64f7f6
Showing 29 changed files with 294 additions and 76 deletions.
6 changes: 4 additions & 2 deletions NEWS.md
@@ -2,14 +2,16 @@

* `step_umap()` has gained `initial` and `target_weight` arguments. (#213)

* Calling `?tidy.step_*()` now sends you to the documentation for `step_*()` where the outcome is documented. (#216)

* Documentation for tidy methods for all steps has been improved to describe the return value more accurately. (#217)

# embed 1.1.3

* `step_collapse_stringdist()` will now return predictors as factors. (#204)

* Fixed regression from 1.1.2 in `step_lencode_glm()` where it couldn't be used on multiple columns.

* Calling `?tidy.step_*()` now sends you to the documentation for `step_*()` where the outcome is documented. (#216)

# embed 1.1.2

## Improvements
15 changes: 10 additions & 5 deletions R/collapse_cart.R
@@ -41,11 +41,16 @@
#' find any signal in the data.
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `"terms"`
#' (the column being modified), `"old"` (the old levels), `"new"` (the new
#' levels), and `"id"`. If the CART model failed or could not find a good split,
#' the requested predictor will not be in the results.
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `terms`, `old`, `new`, and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{old}{character, the old levels}
#' \item{new}{character, the new levels}
#' \item{id}{character, id of this step}
#' }
#'
#' @template case-weights-not-supported
#'
14 changes: 10 additions & 4 deletions R/collapse_stringdist.R
@@ -27,10 +27,16 @@
#' @details
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `"terms"`
#' (the column being modified), `"from"` (the old levels), `"to"` (the new
#' levels), and `"id"`.
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `terms`, `from`, `to`, and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{from}{character, the old levels}
#' \item{to}{character, the new levels}
#' \item{id}{character, id of this step}
#' }
#'
#' @template case-weights-not-supported
#'
12 changes: 9 additions & 3 deletions R/discretize_cart.R
@@ -45,9 +45,15 @@
#' Note that the original data will be replaced with the new bins.
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `terms`
#' (the columns that is selected), `values` is returned.
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `terms`, `value`, and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{value}{numeric, location of the splits}
#' \item{id}{character, id of this step}
#' }
#'
#' ```{r, echo = FALSE, results="asis"}
#' step <- "step_discretize_cart"
12 changes: 9 additions & 3 deletions R/discretize_xgb.R
@@ -62,9 +62,15 @@
#' Note that the original data will be replaced with the new bins.
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `terms`
#' (the columns that is selected), `values` is returned.
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `terms`, `value`, and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{value}{numeric, location of the splits}
#' \item{id}{character, id of this step}
#' }
#'
#' ```{r, echo = FALSE, results="asis"}
#' step <- "step_discretize_xgb"
14 changes: 10 additions & 4 deletions R/embed.R
@@ -102,10 +102,16 @@
#' this step with `caret`, avoid parallel processing.
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `terms`
#' (the selectors or variables selected), `levels` (levels in variable), and a
#' number of columns with embedding information are returned.
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' a number of columns with embedding information, and columns `terms`,
#' `levels`, and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{levels}{character, levels in variable}
#' \item{id}{character, id of this step}
#' }
#'
#' ```{r, echo = FALSE, results="asis"}
#' step <- "step_embed"
11 changes: 8 additions & 3 deletions R/feature_hash.R
@@ -34,9 +34,14 @@
#' [recipes::step_zv()]) is recommended for any recipe that uses hashed columns.
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `terms`
#' (the columns that is selected) is returned.
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `terms` and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{id}{character, id of this step}
#' }
#'
#' @template case-weights-not-supported
#'
15 changes: 11 additions & 4 deletions R/lencode_bayes.R
@@ -61,9 +61,16 @@
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `terms`
#' (the selectors or variables selected), `value` and `component` is returned.
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `level`, `value`, `terms`, and `id`:
#'
#' \describe{
#' \item{level}{character, the factor levels}
#' \item{value}{numeric, the encoding}
#' \item{terms}{character, the selectors or variables selected}
#' \item{id}{character, id of this step}
#' }
#'
#' @template case-weights-supervised
#'
#' @references
@@ -78,7 +85,7 @@
#' "Hierarchical Partial Pooling for Repeated Binary Trials"
#' \url{https://tinyurl.com/stan-pooling}
#'
#' "Prior Distributions for `rstanarm`` Models"
#' "Prior Distributions for `rstanarm` Models"
#' \url{https://tinyurl.com/stan-priors}
#'
#' "Estimating Generalized (Non-)Linear Models with Group-Specific Terms with
13 changes: 10 additions & 3 deletions R/lencode_glm.R
@@ -44,9 +44,16 @@
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `terms`
#' (the selectors or variables selected), `value` and `component` is returned.
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `level`, `value`, `terms`, and `id`:
#'
#' \describe{
#' \item{level}{character, the factor levels}
#' \item{value}{numeric, the encoding}
#' \item{terms}{character, the selectors or variables selected}
#' \item{id}{character, id of this step}
#' }
#'
#' @template case-weights-supervised
#'
#' @references
Expand Down
11 changes: 9 additions & 2 deletions R/lencode_mixed.R
@@ -57,8 +57,15 @@
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `terms`
#' (the selectors or variables selected), `value` and `component` is returned.
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `level`, `value`, `terms`, and `id`:
#'
#' \describe{
#' \item{level}{character, the factor levels}
#' \item{value}{numeric, the encoding}
#' \item{terms}{character, the selectors or variables selected}
#' \item{id}{character, id of this step}
#' }
#'
#' @template case-weights-supervised
#'
11 changes: 9 additions & 2 deletions R/pca_sparse.R
@@ -46,8 +46,15 @@
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `terms`
#' (the selectors or variables selected), `value` and `component` is returned.
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `terms`, `value`, `component`, and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{value}{numeric, variable loading}
#' \item{component}{character, principal component}
#' \item{id}{character, id of this step}
#' }
#'
#' ```{r, echo = FALSE, results="asis"}
#' step <- "step_pca_sparse"
11 changes: 9 additions & 2 deletions R/pca_sparse_bayes.R
@@ -65,9 +65,16 @@
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `terms`
#' (the selectors or variables selected), `value` and `component` is returned.
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `terms`, `value`, `component`, and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{value}{numeric, variable loading}
#' \item{component}{character, principal component}
#' \item{id}{character, id of this step}
#' }
#'
#' ```{r, echo = FALSE, results="asis"}
#' step <- "step_pca_sparse_bayes"
#' result <- knitr::knit_child("man/rmd/tunable-args.Rmd")
23 changes: 20 additions & 3 deletions R/pca_truncated.R
@@ -35,9 +35,26 @@
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, use either `type = "coef"` for
#' the variable loadings per component or `type = "variance"` for how much
#' variance each component accounts for.
#' When you [`tidy()`][tidy.recipe()] this step, two things can happen depending
#' on the `type` argument. If `type = "coef"`, a tibble is returned with 4 columns
#' `terms`, `value`, `component`, and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{value}{numeric, variable loading}
#' \item{component}{character, principal component}
#' \item{id}{character, id of this step}
#' }
#'
#' If `type = "variance"`, a tibble is returned with 4 columns `terms`, `value`,
#' `component`, and `id`:
#'
#' \describe{
#' \item{terms}{character, type of variance}
#' \item{value}{numeric, value of the variance}
#' \item{component}{integer, principal component}
#' \item{id}{character, id of this step}
#' }
#'
#' ```{r, echo = FALSE, results="asis"}
#' step <- "step_pca_truncated"
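
The two `type` variants documented above for `step_pca_truncated()` can be tried out with a minimal sketch like the one below. It is not part of this commit. It assumes the `recipes`, `embed`, and `irlba` packages are installed; the `mtcars` data, `num_comp = 2`, the normalization step, and the step id `"pca"` are illustrative choices only.

```r
library(recipes)
library(embed)

# Illustrative recipe (assumption): truncated PCA on the numeric
# predictors of mtcars, after centering and scaling them.
rec <- recipe(mpg ~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors()) |>
  step_pca_truncated(all_numeric_predictors(), num_comp = 2, id = "pca")

prepped <- prep(rec)

# Variable loadings per component.
coefs <- tidy(prepped, id = "pca", type = "coef")

# Variance information per component.
variances <- tidy(prepped, id = "pca", type = "variance")

# Inspect which columns each variant returns.
names(coefs)
names(variances)
```

Selecting the step by `id` rather than by `number` keeps the call stable if steps are later added or reordered in the recipe.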
9 changes: 7 additions & 2 deletions R/umap.R
@@ -57,8 +57,13 @@
#'
#' # Tidying
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble with columns `terms`
#' (the selectors or variables selected) is returned.
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `terms` and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{id}{character, id of this step}
#' }
#'
#' ```{r, echo = FALSE, results="asis"}
#' step <- "step_umap"
19 changes: 18 additions & 1 deletion R/woe.R
@@ -16,7 +16,7 @@
#' predictors in a model.
#' @param outcome The bare name of the binary outcome encased in `vars()`.
#' @param dictionary A tbl. A map of levels and woe values. It must have the
#' same layout than the output returned from [dictionary()]. If `NULL`` the
#' same layout as the output returned from [dictionary()]. If `NULL` the
#' function will build a dictionary with those variables passed to \code{...}.
#' See [dictionary()] for details.
#' @param Laplace The Laplace smoothing parameter. A value usually applied to
@@ -67,6 +67,23 @@
#' `p_bad`, `p_good`, `woe` and `outcome` is returned. See [dictionary()] for
#' more information.
#'
#' When you [`tidy()`][tidy.recipe()] this step, a tibble is returned with
#' columns `terms`, `value`, `n_tot`, `n_bad`, `n_good`, `p_bad`, `p_good`, `woe`,
#' `outcome`, and `id`:
#'
#' \describe{
#' \item{terms}{character, the selectors or variables selected}
#' \item{value}{character, level of the outcome}
#' \item{n_tot}{integer, total number}
#' \item{n_bad}{integer, number of bad examples}
#' \item{n_good}{integer, number of good examples}
#' \item{p_bad}{numeric, p of bad examples}
#' \item{p_good}{numeric, p of good examples}
#' \item{woe}{numeric, weight of evidence}
#' \item{outcome}{character, name of outcome variable}
#' \item{id}{character, id of this step}
#' }
#'
#' ```{r, echo = FALSE, results="asis"}
#' step <- "step_woe"
#' result <- knitr::knit_child("man/rmd/tunable-args.Rmd")
13 changes: 9 additions & 4 deletions man/step_collapse_cart.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 9 additions & 3 deletions man/step_collapse_stringdist.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 8 additions & 2 deletions man/step_discretize_cart.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 8 additions & 2 deletions man/step_discretize_xgb.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
