Merge pull request #521 from brshallo/document-383

use `fun()` instead of `fun` across docs, fixes #383
tidymodels · Sep 4, 2024 · 7ab9cff
2 parents 3b15c34 + 8fdd408
commit 7ab9cff

Showing 26 changed files with 61 additions and 63 deletions.
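
For orientation, here is a minimal sketch of the convention this change applies. It assumes roxygen2 markdown support is enabled for the package. `count_rows()` is a hypothetical function used only for illustration and is not part of rsample. A name in backticks with trailing parentheses marks a function without linking it, while square brackets turn the name into a link to that function's help page.

```r
# Hypothetical example, not taken from rsample: it only illustrates the
# `fun()` and [fun()] conventions applied in the hunks of this diff.

#' Count the rows of a data frame
#'
#' `count_rows()` returns the number of rows in `data`. It is a thin wrapper
#' around [nrow()]. A function from another package can be linked with a
#' qualified name, for example [stats::glm()].
#'
#' @param data A data frame.
#' @return A single integer.
count_rows <- function(data) {
  nrow(data)
}

# The roxygen comments do not change how the function behaves.
count_rows(mtcars)
#> [1] 32
```

In the hunks below, the plain backtick form is used where a function refers to itself or where a link is not wanted (as in the `make_groups()` change), and the bracketed form is used elsewhere.
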
diff --git a/NEWS.md b/NEWS.md
@@ -12,6 +12,8 @@
 
 * Formatting improvement: package names are now not in backticks anymore (@agmurray, #525).
 
+* Improved documentation and formatting: function names are now more easily identifiable through either `()` at the end or being links to the function documentation (@brshallo , #521).
+
 ## Bug fixes
 
 * `vfold_cv()` now utilizes the `breaks` argument correctly for repeated cross-validation (@ZWael, #471).

diff --git a/R/boot.R b/R/boot.R
@@ -17,7 +17,7 @@
 #' @param times The number of bootstrap samples.
 #' @param apparent A logical. Should an extra resample be added where the
 #'  analysis and holdout subset are the entire data set. This is required for
-#'  some estimators used by the `summary` function that require the apparent
+#'  some estimators used by the [summary()] function that require the apparent
 #'  error rate.
 #' @export
 #' @return A tibble with classes `bootstraps`, `rset`, `tbl_df`, `tbl`, and

diff --git a/R/caret.R b/R/caret.R
@@ -4,10 +4,10 @@
 #'  \pkg{rsample} and \pkg{caret}.
 #'
 #' @param object An `rset` object. Currently,
-#'  `nested_cv` is not supported.
-#' @return `rsample2caret` returns a list that mimics the
+#'  [nested_cv()] is not supported.
+#' @return `rsample2caret()` returns a list that mimics the
 #'  `index` and `indexOut` elements of a
-#'  `trainControl` object. `caret2rsample` returns an
+#'  `trainControl` object. `caret2rsample()` returns an
 #'  `rset` object of the appropriate class.
 #' @export
 rsample2caret <- function(object, data = c("analysis", "assessment")) {
@@ -23,7 +23,7 @@ rsample2caret <- function(object, data = c("analysis", "assessment")) {
 }
 
 #' @rdname rsample2caret
-#' @param ctrl An object produced by `trainControl` that has
+#' @param ctrl An object produced by `caret::trainControl()` that has
 #'  had the `index` and `indexOut` elements populated by
 #'  integers. One method of getting this is to extract the
 #'  `control` objects from an object produced by `train`.

diff --git a/R/form_pred.R b/R/form_pred.R
@@ -1,6 +1,6 @@
 #' Extract Predictor Names from Formula or Terms
 #'
-#' `all.vars` returns all variables used in a formula. This
+#' While [all.vars()] returns all variables used in a formula, this
 #'  function only returns the variables explicitly used on the
 #'  right-hand side (i.e., it will not resolve dots unless the
 #'  object is terms with a data set specified).

diff --git a/R/labels.R b/R/labels.R
@@ -1,10 +1,9 @@
 #' Find Labels from rset Object
 #'
 #' Produce a vector of resampling labels (e.g. "Fold1") from
-#'  an `rset` object. Currently, `nested_cv`
-#'  is not supported.
+#' an `rset` object. Currently, [nested_cv()] is not supported.
 #'
-#' @param object An `rset` object
+#' @param object An `rset` object.
 #' @param make_factor A logical for whether the results should be
 #'  a character or a factor.
 #' @param ... Not currently used.
@@ -68,7 +67,7 @@ labels.rsplit <- function(object, ...) {
 #' For a data set, `add_resample_id()` will add at least one new column that
 #'  identifies which resample that the data came from. In most cases, a single
 #'  column is added but for some resampling methods, two or more are added.
-#' @param .data A data frame
+#' @param .data A data frame.
 #' @param split A single `rset` object.
 #' @param dots A single logical: should the id columns be prefixed with a "."
 #'  to avoid name conflicts with `.data`?

diff --git a/R/make_groups.R b/R/make_groups.R
@@ -25,7 +25,7 @@
 #'  only one) assessment set, but rather allow each observation to be in an
 #'  assessment set zero-or-more times. As a result, those functions don't have
 #'  a `balance` argument, and under the hood always specify `balance = "prop"`
-#'  when they call [make_groups()].
+#'  when they call `make_groups()`.
 #'
 #' @keywords internal
 make_groups <- function(data,

diff --git a/R/nest.R b/R/nest.R
@@ -1,6 +1,6 @@
 #' Nested or Double Resampling
 #'
-#' `nested_cv` can be used to take the results of one resampling procedure
+#' `nested_cv()` can be used to take the results of one resampling procedure
 #'   and conduct further resamples within each split. Any type of resampling
 #'   used in rsample can be used.
 #'

diff --git a/R/permutations.R b/R/permutations.R
@@ -5,7 +5,7 @@
 #'   by permuting/shuffling one or more columns. This results in analysis
 #'   samples where some columns are in their original order and some columns
 #'   are permuted to a random order. Unlike other sampling functions in
-#'   rsample, there is no assessment set and calling `assessment()` on a
+#'   rsample, there is no assessment set and calling [assessment()] on a
 #'   permutation split will throw an error.
 #'
 #' @param data A data frame.

diff --git a/R/printing.R b/R/printing.R
@@ -1,4 +1,4 @@
-## The `pretty` methods below are good for when you need to
+## The `pretty()` methods below are good for when you need to
 ## textually describe the resampling procedure. Note that they
 ## can have more than one element (in the case of nesting)
 

diff --git a/R/reg_intervals.R b/R/reg_intervals.R
@@ -2,18 +2,18 @@
 #'
 #' @param formula An R model formula with one outcome and at least one predictor.
 #' @param data A data frame.
-#' @param model_fn The model to fit. Allowable values are "lm", "glm",
-#'  "survreg", and "coxph". The latter two require that the `survival` package
+#' @param model_fn The model to fit. Allowable values are `"lm"`, `"glm"`,
+#'  `"survreg"`, and `"coxph"`. The latter two require that the survival package
 #'  be installed.
-#' @param type The type of bootstrap confidence interval. Values of "student-t" and
-#' "percentile" are allowed.
+#' @param type The type of bootstrap confidence interval. Values of `"student-t"` and
+#' `"percentile"` are allowed.
 #' @param times A single integer for the number of bootstrap samples. If left
-#' NULL, 1,001 are used for t-intervals and 2,001 for percentile intervals.
+#' `NULL`, 1,001 are used for t-intervals and 2,001 for percentile intervals.
 #' @param alpha Level of significance.
 #' @param filter A logical expression used to remove rows from the final result, or `NULL` to keep all rows.
 #' @param keep_reps Should the individual parameter estimates for each bootstrap
 #' sample be retained?
-#' @param ... Options to pass to the model function (such as `family` for `glm()`).
+#' @param ... Options to pass to the model function (such as `family` for [stats::glm()]).
 #' @return A tibble with columns "term", ".lower", ".estimate", ".upper",
 #' ".alpha", and ".method". If `keep_reps = TRUE`, an additional list column
 #' called ".replicates" is also returned.

diff --git a/R/rsplit.R b/R/rsplit.R
@@ -66,12 +66,12 @@ as.integer.rsplit <-
 #'
 #' The analysis or assessment code can be returned as a data
 #'   frame (as dictated by the `data` argument) using
-#'   `as.data.frame.rsplit`. `analysis` and
-#'   `assessment` are shortcuts.
+#'   `as.data.frame.rsplit()`. `analysis()` and
+#'   `assessment()` are shortcuts.
 #' @param x An `rsplit` object.
 #' @param row.names `NULL` or a character vector giving the row names for the data frame. Missing values are not allowed.
 #' @param optional A logical: should the column names of the data be checked for legality?
-#' @param data Either "analysis" or "assessment" to specify which data are returned.
+#' @param data Either `"analysis"` or `"assessment"` to specify which data are returned.
 #' @param ... Not currently used.
 #' @examples
 #' library(dplyr)

diff --git a/R/tidy.R b/R/tidy.R
@@ -1,19 +1,19 @@
 #' Tidy Resampling Object
 #'
-#' The `tidy` function from the \pkg{broom} package can be used on `rset` and
+#' The `tidy()` function from the \pkg{broom} package can be used on `rset` and
 #'  `rsplit` objects to generate tibbles with which rows are in the analysis and
 #'  assessment sets.
-#' @param x A  `rset` or  `rsplit` object
+#' @param x A `rset` or `rsplit` object
 #' @param unique_ind Should unique row identifiers be returned? For example,
 #'  if `FALSE` then bootstrapping results will include multiple rows in the
 #'  sample for the same row in the original data.
 #' @inheritParams rlang::args_dots_empty
 #' @return A tibble with columns `Row` and `Data`. The latter has possible
-#'  values "Analysis" or "Assessment". For `rset` inputs, identification columns
-#'  are also returned but their names and values depend on the type of
-#'  resampling. `vfold_cv` contains a column "Fold" and, if repeats are used,
-#'  another called "Repeats". `bootstraps` and `mc_cv` use the column
-#'  "Resample".
+#'   values "Analysis" or "Assessment". For `rset` inputs, identification
+#'   columns are also returned but their names and values depend on the type of
+#'   resampling. For [vfold_cv()], contains a column "Fold" and, if repeats are
+#'   used, another called "Repeats". [bootstraps()] and [mc_cv()] use the column
+#'   "Resample".
 #' @details Note that for nested resampling, the rows of the inner resample,
 #'  named `inner_Row`, are *relative* row indices and do not correspond to the
 #'  rows in the original data set.

diff --git a/man/add_resample_id.Rd b/man/add_resample_id.Rd
diff --git a/man/as.data.frame.rsplit.Rd b/man/as.data.frame.rsplit.Rd
diff --git a/man/bootstraps.Rd b/man/bootstraps.Rd
diff --git a/man/form_pred.Rd b/man/form_pred.Rd
diff --git a/man/group_bootstraps.Rd b/man/group_bootstraps.Rd
diff --git a/man/labels.rset.Rd b/man/labels.rset.Rd
diff --git a/man/make_groups.Rd b/man/make_groups.Rd
diff --git a/man/make_strata.Rd b/man/make_strata.Rd
diff --git a/man/nested_cv.Rd b/man/nested_cv.Rd
diff --git a/man/permutations.Rd b/man/permutations.Rd
diff --git a/man/reg_intervals.Rd b/man/reg_intervals.Rd
diff --git a/man/rsample2caret.Rd b/man/rsample2caret.Rd
diff --git a/man/tidy.rsplit.Rd b/man/tidy.rsplit.Rd
diff --git a/vignettes/Working_with_rsets.Rmd b/vignettes/Working_with_rsets.Rmd
@@ -109,7 +109,7 @@ example[1:10, setdiff(names(example), names(attrition))]
 
 For this model, the `.fitted` value is the linear predictor in log-odds units. 
 
-To compute this data set for each of the 100 resamples, we'll use the `map` function from the purrr package:
+To compute this data set for each of the 100 resamples, we'll use the `map()` function from the purrr package:
 
 ```{r model_purrr, warning=FALSE}
 library(purrr)
@@ -182,8 +182,7 @@ The calculated 95% confidence interval contains zero, so we don't have evidence
 
 ## Bootstrap Estimates of Model Coefficients
 
-Unless there is already a column in the resample object that contains the fitted model, a function can be used to fit the model and save all of the model coefficients. The [broom package](https://cran.r-project.org/package=broom) package has a `tidy` function that will save the coefficients in a data frame. Instead of returning a data frame with a row for each model term, we will save a data frame with a single row and columns for each model term. As before, `purrr::map()` can be used to estimate and save these values for each split. 
-
+Unless there is already a column in the resample object that contains the fitted model, a function can be used to fit the model and save all of the model coefficients. The [broom package](https://cran.r-project.org/package=broom) package has a `tidy()` function that will save the coefficients in a data frame. Instead of returning a data frame with a row for each model term, we will save a data frame with a single row and columns for each model term. As before, `purrr::map()` can be used to estimate and save these values for each split.
 
 ```{r coefs}
 glm_coefs <- function(splits, ...) {