Commit ec8e94b

Prepare CRAN release (#1008)
strengejacke authored Sep 3, 2024
1 parent 7cb0ab7 commit ec8e94b
Showing 23 changed files with 735 additions and 71 deletions.
3 changes: 1 addition & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Type: Package
Package: parameters
Title: Processing of Model Parameters
Version: 0.22.1.9
Version: 0.22.1.99
Authors@R:
c(person(given = "Daniel",
family = "Lüdecke",
@@ -221,4 +221,3 @@ Config/testthat/edition: 3
Config/testthat/parallel: true
Config/Needs/website: easystats/easystatstemplate
Config/rcmdcheck/ignore-inconsequential-notes: true
Remotes: easystats/insight
19 changes: 19 additions & 0 deletions NAMESPACE
@@ -350,6 +350,22 @@ S3method(model_parameters,zeroinfl)
S3method(model_parameters,zoo)
S3method(p_calibrate,default)
S3method(p_calibrate,numeric)
S3method(p_significance,coxph)
S3method(p_significance,feis)
S3method(p_significance,felm)
S3method(p_significance,gee)
S3method(p_significance,glm)
S3method(p_significance,glmmTMB)
S3method(p_significance,gls)
S3method(p_significance,hurdle)
S3method(p_significance,lm)
S3method(p_significance,lme)
S3method(p_significance,merMod)
S3method(p_significance,mixed)
S3method(p_significance,rma)
S3method(p_significance,svyglm)
S3method(p_significance,wbm)
S3method(p_significance,zeroinfl)
S3method(p_value,BBmm)
S3method(p_value,BBreg)
S3method(p_value,BFBayesFactor)
@@ -547,6 +563,7 @@ S3method(print,n_clusters_hclust)
S3method(print,n_clusters_silhouette)
S3method(print,n_factors)
S3method(print,p_calibrate)
S3method(print,p_significance_lm)
S3method(print,parameters_brms_meta)
S3method(print,parameters_clusters)
S3method(print,parameters_da)
@@ -914,6 +931,7 @@ export(n_factors)
export(n_parameters)
export(p_calibrate)
export(p_function)
export(p_significance)
export(p_value)
export(p_value_betwithin)
export(p_value_kenward)
@@ -951,6 +969,7 @@ export(supported_models)
export(visualisation_recipe)
importFrom(bayestestR,ci)
importFrom(bayestestR,equivalence_test)
importFrom(bayestestR,p_significance)
importFrom(datawizard,demean)
importFrom(datawizard,describe_distribution)
importFrom(datawizard,kurtosis)
2 changes: 2 additions & 0 deletions NEWS.md
@@ -7,6 +7,8 @@

## Changes

* Added `p_significance()` methods for frequentist models (see the sketch below).

* Methods for `degrees_of_freedom()` have been removed. `degrees_of_freedom()`
now calls `insight::get_df()`.

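A minimal usage sketch for the new `p_significance()` methods; the model and threshold here are purely illustrative:

```r
library(parameters)
m <- lm(mpg ~ gear + hp + wt, data = mtcars)
# probability that each coefficient exceeds a threshold of practical relevance
p_significance(m, threshold = 0.6)
```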
92 changes: 83 additions & 9 deletions R/1_model_parameters.R
Expand Up @@ -74,10 +74,7 @@
#' packages or other software packages (like SPSS). To mimic behaviour of SPSS
#' or packages such as **lm.beta**, use `standardize = "basic"`.
#'
#' @section
#'
#' Standardization Methods:
#'
#' @section Standardization Methods:
#' - **refit**: This method is based on a complete model re-fit with a
#' standardized version of the data. Hence, this method is equal to
#' standardizing the variables before fitting the model. It is the "purest" and
@@ -280,21 +277,98 @@
#' p-values are based on the probability of direction ([`bayestestR::p_direction()`]),
#' which is converted into a p-value using [`bayestestR::pd_to_p()`].
#'
#' @section Statistical inference - how to quantify evidence:
#' There is no standardized approach to drawing conclusions based on the
#' available data and statistical models. A frequently chosen but also much
#' criticized approach is to evaluate results based on their statistical
#' significance (*Amrhein et al. 2017*).
#'
#' A more sophisticated way is to test whether estimated effects exceed the
#' "smallest effect size of interest", to avoid effects being considered
#' relevant simply because they are statistically significant, even though
#' they are clinically or practically irrelevant (*Lakens et al. 2018,
#' Lakens 2024*).
#'
#' A rather unconventional approach, which is nevertheless advocated by various
#' authors, is to interpret results from classical regression models in terms of
#' probabilities, similar to the usual approach in Bayesian statistics
#' (*Greenland et al. 2022; Rafi and Greenland 2020; Schweder 2018; Schweder and
#' Hjort 2003; Vos 2022*).
#'
#' The _parameters_ package provides several options and functions to aid
#' statistical inference, for example:
#' - [`equivalence_test()`], to compute the (conditional) equivalence test for
#' frequentist models
#' - [`p_significance()`], to compute the probability of *practical significance*,
#' which can be conceptualized as a unidirectional equivalence test
#' - [`p_function()`], or _consonance function_, to compute p-values and
#' compatibility (confidence) intervals for statistical models
#' - the `pd` argument (setting `pd = TRUE`) in `model_parameters()` includes
#' a column with the *probability of direction*, i.e. the probability that a
#' parameter is strictly positive or negative. See [`bayestestR::p_direction()`]
#' for details.
#' - the `s_value` argument (setting `s_value = TRUE`) in `model_parameters()`
#' replaces the p-values with their related _S_-values (*Rafi and Greenland 2020*)
#' - finally, it is possible to generate distributions of model coefficients by
#'   drawing bootstrap samples (setting `bootstrap = TRUE`) or by simulating
#'   draws from model coefficients using [`simulate_model()`]. These samples
#' can then be treated as "posterior samples" and used in many functions from
#' the **bayestestR** package.
#'
#' Most of the options and functions shown above derive from methods originally
#' implemented for Bayesian models (*Makowski et al. 2019*). However, assuming
#' that model assumptions are met (i.e., the model fits the data well and the
#' chosen model correctly reflects the data generating process, including the
#' distributional model family), it seems appropriate to interpret results from
#' classical frequentist models in a "Bayesian way" (more details: documentation
#' in [`p_function()`]; see also the usage sketch below).
#'
#' @inheritSection format_parameters Interpretation of Interaction Terms
#' @inheritSection print.parameters_model Global Options to Customize Messages and Tables when Printing
#'
#' @references
#'
#' - Amrhein, V., Korner-Nievergelt, F., and Roth, T. (2017). The earth is
#' flat (p > 0.05): Significance thresholds and the crisis of unreplicable
#' research. PeerJ, 5, e3544. \doi{10.7717/peerj.3544}
#'
#' - Greenland S, Rafi Z, Matthews R, Higgs M. To Aid Scientific Inference,
#' Emphasize Unconditional Compatibility Descriptions of Statistics. (2022)
#' https://arxiv.org/abs/1909.08583v7 (Accessed November 10, 2022)
#'
#' - Hoffman, L. (2015). Longitudinal analysis: Modeling within-person
#' fluctuation and change. Routledge.
#'
#' - Lakens, D. (2024). Improving Your Statistical Inferences (Version v1.5.1).
#' Retrieved from https://lakens.github.io/statistical_inferences/.
#' \doi{10.5281/ZENODO.6409077}
#'
#' - Lakens, D., Scheel, A. M., and Isager, P. M. (2018). Equivalence Testing
#' for Psychological Research: A Tutorial. Advances in Methods and Practices
#' in Psychological Science, 1(2), 259–269. \doi{10.1177/2515245918770963}
#'
#' - Makowski, D., Ben-Shachar, M. S., Chen, S. H. A., and Lüdecke, D. (2019).
#' Indices of Effect Existence and Significance in the Bayesian Framework.
#' Frontiers in Psychology, 10, 2767. \doi{10.3389/fpsyg.2019.02767}
#'
#' - Neter, J., Wasserman, W., & Kutner, M. H. (1989). Applied linear
#' regression models.
#'
#' - Rafi Z, Greenland S. Semantic and cognitive tools to aid statistical
#' science: replace confidence and significance by compatibility and surprise.
#' BMC Medical Research Methodology (2020) 20:244.
#'
#' - Schweder T. Confidence is epistemic probability for empirical science.
#' Journal of Statistical Planning and Inference (2018) 195:116–125.
#' \doi{10.1016/j.jspi.2017.09.016}
#'
#' - Schweder T, Hjort NL. Frequentist analogues of priors and posteriors.
#' In Stigum, B. (ed.), Econometrics and the Philosophy of Economics: Theory
#' Data Confrontation in Economics, pp. 285-217. Princeton University Press,
#' Princeton, NJ, 2003.
#'
#' - Vos P, Holbert D. Frequentist statistical inference without repeated sampling.
#' Synthese 200, 89 (2022). \doi{10.1007/s11229-022-03560-x}
#'
#' @return A data frame of indices related to the model's parameters.
#' @export
model_parameters <- function(model, ...) {
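# Illustrative sketch of the inference options documented in the
# 'Statistical inference - how to quantify evidence' section above;
# the model and data (mtcars) are for demonstration only.
library(parameters)
m <- lm(mpg ~ gear + wt + cyl + hp, data = mtcars)
model_parameters(m, pd = TRUE)         # adds a probability-of-direction column
model_parameters(m, s_value = TRUE)    # replaces p-values with S-values
p_significance(m)                      # probability of practical significance
p_function(m)                          # consonance function / compatibility intervals
model_parameters(m, bootstrap = TRUE)  # bootstrapped coefficient distributions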
88 changes: 67 additions & 21 deletions R/equivalence_test.R
@@ -24,8 +24,9 @@ bayestestR::equivalence_test
#' @inheritParams model_parameters.merMod
#' @inheritParams p_value
#'
#' @seealso For more details, see [bayestestR::equivalence_test()].
#' Further readings can be found in the references.
#' @seealso For more details, see [bayestestR::equivalence_test()]. Further
#' readings can be found in the references. See also [`p_significance()`] for
#' a unidirectional equivalence test.
#'
#' @details
#' In classical null hypothesis significance testing (NHST) within a frequentist
@@ -111,6 +112,8 @@ bayestestR::equivalence_test
#' (argument `range`). See 'Details' in [bayestestR::rope_range()]
#' for further information.
#'
#' @inheritSection model_parameters Statistical inference - how to quantify evidence
#'
#' @note There is also a [`plot()`-method](https://easystats.github.io/see/articles/parameters.html)
#' implemented in the [**see**-package](https://easystats.github.io/see/).
#'
@@ -339,6 +342,7 @@ equivalence_test.ggeffects <- function(x,
l <- Map(
function(ci_wide, ci_narrow) {
.equivalence_test_numeric(
ci = ci,
ci_wide,
ci_narrow,
range_rope = range,
@@ -432,11 +436,11 @@ equivalence_test.ggeffects <- function(x,
l <- Map(
function(ci_wide, ci_narrow) {
.equivalence_test_numeric(
ci = ci,
ci_wide,
ci_narrow,
range_rope = range,
rule = rule,
ci = ci,
verbose = verbose
)
}, conf_int, conf_int2
@@ -520,11 +524,11 @@ equivalence_test.ggeffects <- function(x,
l <- Map(
function(ci_wide, ci_narrow) {
.equivalence_test_numeric(
ci = ci,
ci_wide,
ci_narrow,
range_rope = range,
rule = rule,
ci = ci,
verbose = verbose
)
}, conf_int, conf_int2
@@ -542,7 +546,12 @@ equivalence_test.ggeffects <- function(x,


#' @keywords internal
.equivalence_test_numeric <- function(ci_wide, ci_narrow, range_rope, rule, ci = 0.95, verbose) {
.equivalence_test_numeric <- function(ci = 0.95,
ci_wide,
ci_narrow,
range_rope,
rule,
verbose) {
final_ci <- NULL

# ==== HDI+ROPE decision rule, by Kruschke ====
@@ -593,7 +602,7 @@ equivalence_test.ggeffects <- function(x,
data.frame(
CI_low = final_ci[1],
CI_high = final_ci[2],
SGPV = .rope_coverage(range_rope, final_ci),
SGPV = .rope_coverage(ci = ci, range_rope, ci_range = final_ci),
ROPE_low = range_rope[1],
ROPE_high = range_rope[2],
ROPE_Equivalence = decision,
@@ -645,29 +654,66 @@ equivalence_test.ggeffects <- function(x,
# same range / limits as the confidence interval, thus indeed representing a
# normally distributed confidence interval. We then calculate the probability
# mass of this interval that is inside the ROPE.
.rope_coverage <- function(range_rope, ci_range) {
diff_ci <- abs(diff(ci_range))
out <- bayestestR::distribution_normal(
n = 1000,
mean = ci_range[2] - (diff_ci / 2),
# we divide the complete range by 2, the one-directional range for the SD
# then, the range from mean value to lower/upper limit, for a normal
# distribution is approximately 3.3 SD (3 SD cover 99.7% of the probability
# mass of the normal distribution). Thus, assuming that half of the ci_range
# refers to ~ 3.3 SD, we "normalize" the value (i.e. divide by 3.290525) to
# get the value for one SD, which we need to build the normal distribution.
sd = (diff_ci / 2) / 3.290525
)

.rope_coverage <- function(ci = 0.95, range_rope, ci_range) {
out <- .generate_posterior_from_ci(ci, ci_range)
# compare: ci_range and range(out)

# The SGPV refers to the proportion of the confidence interval inside the
# full ROPE - thus, we set ci = 1 here
rc <- bayestestR::rope(out, range = range_rope, ci = 1)
rc$ROPE_Percentage
}
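# Illustrative sketch of the SGPV computation above, using made-up CI limits
# (a 95% CI of c(-0.2, 0.8)) and a ROPE of c(-0.1, 0.1):
ci_range <- c(-0.2, 0.8)
sim <- bayestestR::distribution_normal(
  n = 10000,
  mean = mean(ci_range),
  # same SD approximation as in .generate_posterior_from_ci() below, ci = 0.95
  sd = abs(diff(ci_range)) / (stats::qnorm(0.975) * (stats::qnorm(0.999975) / 2))
)
# proportion of the simulated "posterior" inside the full ROPE (ci = 1)
bayestestR::rope(sim, range = c(-0.1, 0.1), ci = 1)$ROPE_Percentage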


.generate_posterior_from_ci <- function(ci = 0.95, ci_range, precision = 10000) {
# this function creates an approximate normal distribution that covers
# the CI-range, i.e. we "simulate" a posterior distribution of a
# frequentist CI
diff_ci <- abs(diff(ci_range))
bayestestR::distribution_normal(
n = precision,
mean = ci_range[2] - (diff_ci / 2),
# we divide the complete range by 2, the one-directional range for the SD.
# then, the range from mean value to lower/upper limit, for a normal
# distribution is approximately 3.3 SD (3 SD cover 99.7% of the probability
# mass of the normal distribution, `1 - ((1 - pnorm(3)) * 2)`). Thus,
# assuming that half of the ci_range refers to ~ 3.3 SD, we "normalize" the
# value (i.e. divide by 3.3) to get the value for one SD, which we need
# to build the normal distribution. The SD itself varies by confidence level,
# therefore we have a multiplier based on the confidence level. I agree this
# looks *very* hacky, but it is tested against the following code, which uses
# this approach to create a normal distribution with "full" coverage, based
# on the approximation of the SD related to the CI-level. From this normal
# distribution, the CI-level % interval is drawn, and the range of the
# simulated normal distribution equals the desired range.
# -------------------------------------------------------------------------
# m <- lm(mpg ~ gear + hp + wt + cyl + am, data = mtcars)
# ci <- 0.75
# mp <- model_parameters(m, ci = ci)
# ci_range <- c(mp$CI_low[2], mp$CI_high[2])
# diff_ci <- abs(diff(ci_range))
# out <- bayestestR::distribution_normal(
# n = 10000,
# mean = ci_range[2] - (diff_ci / 2),
# sd = diff_ci / ((stats::qnorm((1+ci)/2) * (stats::qnorm(0.999975) / 2)))
# )
# # these two ranges are roughly the same
# ci(out, ci = ci)
# ci_range
# -------------------------------------------------------------------------
# furthermore, using this approximation, the following three approaches yield
# similar results:
# -------------------------------------------------------------------------
# m <- lm(mpg ~ gear + wt + cyl + hp, data = mtcars)
# m2 <- brm(mpg ~ gear + wt + cyl + hp, data = mtcars)
# p_significance(m, threshold = 0.6) # the default for "mpg" as response
# p_significance(m2)
# p_significance(simulate_model(m))
# -------------------------------------------------------------------------
sd = diff_ci / ((stats::qnorm((1 + ci) / 2) * (stats::qnorm(0.999975) / 2)))
)
}
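# Quick check mirroring the commented test above: the CI of the simulated
# distribution should roughly recover the (hypothetical) input CI limits.
ci_level <- 0.75
ci_range <- c(-1.5, 0.5)  # made-up CI limits for the sketch
out <- .generate_posterior_from_ci(ci = ci_level, ci_range = ci_range)
bayestestR::ci(out, ci = ci_level)  # should be close to -1.5 and 0.5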


.add_p_to_equitest <- function(model, ci, range) {
tryCatch(
{