
more accuracy for p_sig/p_dir (#1012)
* more accuracy for p_sig/p_dir

* use bayestestR dev

* fix test

* vignette

* fix test
strengejacke authored Sep 12, 2024
1 parent 29baa97 commit 37509c2
Showing 13 changed files with 188 additions and 167 deletions.
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -113,6 +113,7 @@ Suggests:
coxme,
cplm,
dbscan,
distributional,
domir (>= 0.2.0),
drc,
DRR,
@@ -221,4 +222,4 @@ Config/testthat/edition: 3
Config/testthat/parallel: true
Config/Needs/website: easystats/easystatstemplate
Config/rcmdcheck/ignore-inconsequential-notes: true
Remotes: easystats/insight, easystats/bayestestR
98 changes: 64 additions & 34 deletions R/equivalence_test.R
@@ -10,15 +10,14 @@ bayestestR::equivalence_test
#'
#' @param x A statistical model.
#' @param range The range of practical equivalence of an effect. May be
#' `"default"`, to automatically define this range based on properties of the
#' model's data.
#' `"default"`, to automatically define this range based on properties of the
#' model's data.
#' @param ci Confidence Interval (CI) level. Defaults to `0.95` (`95%`).
#' @param rule Character, indicating the rules when testing for practical
#' equivalence. Can be `"bayes"`, `"classic"` or `"cet"`. See 'Details'.
#' @param test Hypothesis test for computing contrasts or pairwise comparisons.
#' See [`?ggeffects::test_predictions`](https://strengejacke.github.io/ggeffects/reference/test_predictions.html)
#' for details.
#' @param verbose Toggle warnings and messages.
#' @param ... Arguments passed to or from other methods.
#' @inheritParams model_parameters.merMod
@@ -28,18 +27,17 @@ bayestestR::equivalence_test
#' readings can be found in the references. See also [`p_significance()`] for
#' a unidirectional equivalence test.
#'
#' @details In classical null hypothesis significance testing (NHST) within a
#' frequentist framework, it is not possible to accept the null hypothesis, H0 -
#' unlike in Bayesian statistics, where such probability statements are
#' possible. "[...] one can only reject the null hypothesis if the test
#' statistics falls into the critical region(s), or fail to reject this
#' hypothesis. In the latter case, all we can say is that no significant effect
#' was observed, but one cannot conclude that the null hypothesis is true."
#' (_Pernet 2017_). One way to address this issue without Bayesian methods is
#' *Equivalence Testing*, as implemented in `equivalence_test()`. While you
#' can either reject the null hypothesis or claim an inconclusive result in
#' NHST, the equivalence test - according to _Pernet_ - adds a third category,
#' *"accept"*. Roughly speaking, the idea behind equivalence testing in a
#' frequentist framework is to check whether an estimate and its uncertainty
#' (i.e. confidence interval) falls within a region of "practical equivalence".
@@ -95,25 +93,25 @@ bayestestR::equivalence_test
#' ## p-Values
#' The equivalence p-value is the area of the (cumulative) confidence
#' distribution that is outside of the region of equivalence. It can be
#' interpreted as p-value for *rejecting* the alternative hypothesis and
#' *accepting* the "null hypothesis" (i.e. assuming practical equivalence). That
#' is, a high p-value means we reject the assumption of practical equivalence
#' and accept the alternative hypothesis.
#'
#' ## Second Generation p-Value (SGPV)
#' Second generation p-values (SGPV) were proposed as a statistic that
#' represents _the proportion of data-supported hypotheses that are also null
#' hypotheses_ _(Blume et al. 2018, Lakens and Delacre 2020)_. It represents the
#' proportion of the _full_ confidence interval range (assuming a normally or
#' t-distributed, equal-tailed interval, based on the model) that is inside the
#' ROPE. The SGPV ranges from zero to one. Higher values indicate that the
#' effect is more likely to be practically equivalent ("not of interest").
#'
#' Note that the assumed interval, which is used to calculate the SGPV, is an
#' estimation of the _full interval_ based on the chosen confidence level. For
#' example, if the 95% confidence interval of a coefficient ranges from -1 to 1,
#' the underlying _full (normally or t-distributed) interval_ approximately
#' ranges from -1.9 to 1.9, see also the following code:
#'
#' ```
#' # simulate full normal distribution
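#' # a minimal completion of this (collapsed) example; assuming the 95% CI
#' # from -1 to 1 mentioned above, the recovered sd is 1 / qnorm(0.975)
#' out <- bayestestR::distribution_normal(n = 10000, mean = 0, sd = 1 / qnorm(0.975))
#' # the full simulated distribution spans roughly -1.9 to 1.9
#' range(out)
#' ```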
@@ -390,6 +388,7 @@ equivalence_test.ggeffects <- function(x,
focal <- attributes(x)$original.terms
obj_name <- attributes(x)$model.name
ci <- attributes(x)$ci.lvl
dof <- attributes(x)$df

x <- .get_ggeffects_model(x)

@@ -419,6 +418,7 @@ equivalence_test.ggeffects <- function(x,
ci_narrow,
range_rope = range,
rule = rule,
dof = dof,
verbose = verbose
)
}, conf_int, conf_int2
@@ -490,6 +490,18 @@ equivalence_test.ggeffects <- function(x,
}


# ==== check degrees of freedom ====

# find the degrees-of-freedom column; when parameters have different
# dfs, fall back to the normal approximation (df = Inf)
df_column <- grep("(df|df_error)", colnames(x))
if (length(df_column) > 0) {
dof <- unique(x[[df_column]])
if (length(dof) > 1) {
dof <- Inf
}
} else {
dof <- Inf
}

# ==== requested confidence intervals ====

params <- conf_int <- .ci_generic(x, ci = ci)
@@ -513,6 +525,7 @@ equivalence_test.ggeffects <- function(x,
ci_narrow,
range_rope = range,
rule = rule,
dof = dof,
verbose = verbose
)
}, conf_int, conf_int2
@@ -623,6 +636,7 @@ equivalence_test.ggeffects <- function(x,
ci_narrow,
range_rope,
rule,
dof = Inf,
verbose) {
final_ci <- NULL

@@ -674,7 +688,7 @@ equivalence_test.ggeffects <- function(x,
data.frame(
CI_low = final_ci[1],
CI_high = final_ci[2],
SGPV = .rope_coverage(ci = ci, range_rope, ci_range = final_ci, dof = dof),
ROPE_low = range_rope[1],
ROPE_high = range_rope[2],
ROPE_Equivalence = decision,
@@ -726,8 +740,8 @@ equivalence_test.ggeffects <- function(x,
# same range / limits as the confidence interval, thus indeed representing a
# normally distributed confidence interval. We then calculate the probability
# mass of this interval that is inside the ROPE.
.rope_coverage <- function(ci = 0.95, range_rope, ci_range, dof = Inf) {
out <- .generate_posterior_from_ci(ci, ci_range, dof = dof)
# compare: ci_range and range(out)
# The SGPV refers to the proportion of the confidence interval inside the
# full ROPE - thus, we set ci = 1 here
@@ -736,23 +750,39 @@ equivalence_test.ggeffects <- function(x,
}
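For intuition, the SGPV that `.rope_coverage()` returns is just the share of the simulated distribution falling inside the ROPE. A minimal sketch, not part of this commit; the ROPE and distribution parameters are illustrative:

```
# illustrative values only: proportion of a simulated "posterior"
# inside a hypothetical ROPE of [-0.1, 0.1]
out <- bayestestR::distribution_normal(n = 10000, mean = 0.3, sd = 0.2)
mean(out >= -0.1 & out <= 0.1) # about 0.14
```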


.generate_posterior_from_ci <- function(ci = 0.95, ci_range, dof = Inf, precision = 10000) {
# this function creates an approximate normal distribution that covers the
# CI-range, i.e. we "simulate" a posterior distribution from a frequentist CI

# sanity check - dof argument
if (is.null(dof)) {
dof <- Inf
}
# first we need the range of the CI (in units), also to calculate the mean of
# the normal distribution
diff_ci <- abs(diff(ci_range))
mean_dist <- ci_range[2] - (diff_ci / 2)
# then we need the critical values of the quantiles from the CI range
z_value <- stats::qt((1 + ci) / 2, df = dof)
# the range of Z-scores (from lower to upper quantile) gives us the range of
# the provided interval in terms of standard deviations. now we divide the
# known range of the provided CI (in units) by the z-score-range, which will
# give us the standard deviation of the distribution.
sd_dist <- diff_ci / diff(c(-1 * z_value, z_value))
# generate a normal distribution if we don't have a t-distribution, or if
# the necessary packages are not installed
if (is.infinite(dof) || !insight::check_if_installed("distributional", quietly = TRUE)) {
# tell user to install "distributional"
if (!is.infinite(dof)) {
insight::format_alert("For models with only few degrees of freedom, install the {distributional} package to increase accuracy of `p_direction()`, `p_significance()` and `equivalence_test()`.") # nolint
}
# we now know all parameters (mean and sd) to simulate a normal distribution
bayestestR::distribution_normal(n = precision, mean = mean_dist, sd = sd_dist)
} else {
insight::check_if_installed("distributional")
out <- distributional::dist_student_t(df = dof, mu = mean_dist, sigma = sd_dist)
sort(unlist(distributional::generate(out, times = precision), use.names = FALSE))
}
}


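To see what the new `dof` argument changes, here is a minimal sketch (not part of this commit) that calls the internal helper from the hunk above; it assumes the package internals are loaded, e.g. via `devtools::load_all()`:

```
# simulate a "posterior" from a 95% CI of [-1, 1]
ci_range <- c(-1, 1)

# normal approximation (dof = Inf): recovered sd is 1 / qnorm(0.975), about 0.51
out_norm <- .generate_posterior_from_ci(ci = 0.95, ci_range, dof = Inf)

# t-approximation with few degrees of freedom: smaller sigma
# (1 / qt(0.975, df = 5), about 0.39) but heavier tails; needs {distributional}
out_t <- .generate_posterior_from_ci(ci = 0.95, ci_range, dof = 5)

# both should approximately reproduce the original 95% interval
quantile(out_norm, probs = c(0.025, 0.975))
quantile(out_t, probs = c(0.025, 0.975))
```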
26 changes: 14 additions & 12 deletions R/p_significance.R
@@ -35,21 +35,22 @@ bayestestR::p_significance
#' for functions related to checking effect existence and significance.
#'
#' @details `p_significance()` returns the proportion of the _full_ confidence
#' interval range (assuming a normally or t-distributed, equal-tailed interval,
#' based on the model) that is outside a certain range (the negligible effect,
#' or ROPE, see argument `threshold`). If there are values of the distribution
#' both below and above the ROPE, `p_significance()` returns the higher
#' probability of a value being outside the ROPE. Typically, this value should
#' be larger than 0.5 to indicate practical significance. However, if the range
#' of the negligible effect is rather large compared to the range of the
#' confidence interval, `p_significance()` will be less than 0.5, which
#' indicates no clear practical significance.
#'
#' Note that the assumed interval, which is used to calculate the practical
#' significance, is an estimation of the _full interval_ based on the chosen
#' confidence level. For example, if the 95% confidence interval of a
#' coefficient ranges from -1 to 1, the underlying _full (normally or
#' t-distributed) interval_ approximately ranges from -1.9 to 1.9, see also the
#' following code:
#'
#' ```
#' # simulate full normal distribution
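#' # a minimal completion of this (collapsed) example, assuming a 95% CI
#' # from -1 to 1; the recovered sd is 1 / qnorm(0.975), about 0.51
#' out <- bayestestR::distribution_normal(n = 10000, mean = 0, sd = 1 / qnorm(0.975))
#' range(out) # roughly -1.9 to 1.9
#' ```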
@@ -178,11 +179,12 @@ p_significance.lm <- function(x, threshold = "default", ci = 0.95, verbose = TRU
.posterior_ci <- function(x, ci, ...) {
# first, we need CIs
out <- ci(x, ci = ci, ...)
dof <- .safe(insight::get_df(x, type = "wald"), Inf)
# we now iterate all confidence intervals and create an approximate normal
# distribution that covers the CI-range.
posterior <- as.data.frame(lapply(seq_len(nrow(out)), function(i) {
ci_range <- as.numeric(out[i, c("CI_low", "CI_high")])
.generate_posterior_from_ci(ci, ci_range, dof = dof)
}))
colnames(posterior) <- out$Parameter

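For intuition, `p_significance()` reports the larger of the two tail probabilities outside the threshold range. A hand-rolled sketch; the values are illustrative, not from this commit:

```
# illustrative: probability of being outside a threshold range of [-0.1, 0.1]
posterior <- bayestestR::distribution_normal(n = 10000, mean = 0.3, sd = 0.2)
max(mean(posterior > 0.1), mean(posterior < -0.1)) # about 0.84
```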
39 changes: 19 additions & 20 deletions man/equivalence_test.lm.Rd

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion man/p_direction.lm.Rd

Some generated files are not rendered by default.
