diff --git a/docs/articles/posterior.html b/docs/articles/posterior.html
index 0e0439f75..24f6b6a34 100644
--- a/docs/articles/posterior.html
+++ b/docs/articles/posterior.html
@@ -144,72 +144,74 @@
We can easily customise the summary statistics reported by $summary()
and $print().
fit <- cmdstanr::cmdstanr_example("schools", method = "sample")
fit$summary()
-Warning: 130 of 4000 (3.0%) transitions ended with a divergence.
+Warning: 302 of 4000 (8.0%) transitions ended with a divergence.
See https://mc-stan.org/misc/warnings for details.
- variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-1 lp__ -58.9 -59.2 5.0 5.1 -66.97 -50 1 224 84
-2 mu 6.6 6.7 4.3 4.3 -0.55 14 1 394 115
-3 tau 5.8 5.0 3.7 3.4 1.33 13 1 223 92
-4 theta[1] 9.7 9.0 7.3 6.3 -0.99 23 1 1066 2034
-5 theta[2] 7.0 6.8 5.9 5.6 -2.54 16 1 900 2321
-6 theta[3] 5.5 5.8 6.9 6.1 -6.25 16 1 841 2201
-7 theta[4] 6.8 6.9 6.2 6.0 -3.13 17 1 781 2193
-8 theta[5] 4.6 5.0 6.0 5.8 -5.63 14 1 513 940
-9 theta[6] 5.5 5.7 6.3 5.6 -5.61 15 1 782 1784
-10 theta[7] 9.5 9.1 6.2 5.8 0.14 20 1 882 2164
-11 theta[8] 7.1 7.0 7.3 6.3 -4.80 18 1 976 2151
+Warning: 1 of 4 chains had an E-BFMI less than 0.2.
+See https://mc-stan.org/misc/warnings for details.
+ variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
+1 lp__ -56.7 -57.1 6.1 6.4 -66.008 -46 1.1 33 19
+2 mu 6.6 6.7 4.2 4.5 0.025 13 1.0 124 879
+3 tau 4.7 3.9 3.6 3.4 0.790 12 1.1 33 22
+4 theta[1] 9.1 8.5 6.8 6.0 -0.082 21 1.0 163 405
+5 theta[2] 7.0 6.9 5.5 5.8 -1.456 16 1.0 236 1810
+6 theta[3] 5.7 6.1 6.4 6.2 -4.835 16 1.0 313 1466
+7 theta[4] 6.8 6.9 5.9 5.8 -2.355 16 1.0 246 1356
+8 theta[5] 4.9 5.2 5.8 5.5 -5.038 13 1.0 219 932
+9 theta[6] 5.7 5.9 5.8 5.8 -4.073 15 1.0 278 1208
+10 theta[7] 8.9 8.6 6.0 5.7 0.094 19 1.0 176 352
+11 theta[8] 7.0 7.1 6.5 5.9 -3.088 18 1.0 317 1709
By default all variables are summarised with the following functions:
-
+
posterior::default_summary_measures()
[1] "mean" "median" "sd" "mad" "quantile2"
To change the variables summarised, we use the variables argument.
-
+
fit$summary(variables = c("mu", "tau"))
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
-1 mu 6.6 6.7 4.3 4.3 -0.55 14 1 394 115
-2 tau 5.8 5.0 3.7 3.4 1.33 13 1 223 92
+1 mu 6.6 6.7 4.2 4.5 0.025 13 1.0 124 879
+2 tau 4.7 3.9 3.6 3.4 0.790 12 1.1 33 22
We can additionally change which functions are used.
-
+
fit$summary(variables = c("mu", "tau"), mean, sd)
variable mean sd
-1 mu 6.6 4.3
-2 tau 5.8 3.7
+1 mu 6.6 4.2
+2 tau 4.7 3.6
To summarise all variables with non-default functions, it is
necessary to explicitly set the variables argument, either to NULL
or the full vector of variable names.
-
+
fit$metadata()$model_params
fit$summary(variables = NULL, "mean", "median")
[1] "lp__" "mu" "tau" "theta[1]" "theta[2]" "theta[3]"
[7] "theta[4]" "theta[5]" "theta[6]" "theta[7]" "theta[8]"
variable mean median
-1 lp__ -58.9 -59.2
+1 lp__ -56.7 -57.1
2 mu 6.6 6.7
-3 tau 5.8 5.0
-4 theta[1] 9.7 9.0
-5 theta[2] 7.0 6.8
-6 theta[3] 5.5 5.8
+3 tau 4.7 3.9
+4 theta[1] 9.1 8.5
+5 theta[2] 7.0 6.9
+6 theta[3] 5.7 6.1
7 theta[4] 6.8 6.9
-8 theta[5] 4.6 5.0
-9 theta[6] 5.5 5.7
-10 theta[7] 9.5 9.1
-11 theta[8] 7.1 7.0
+8 theta[5] 4.9 5.2
+9 theta[6] 5.7 5.9
+10 theta[7] 8.9 8.6
+11 theta[8] 7.0 7.1
Summary functions can be specified by character string, function, or
using a formula (or anything else supported by rlang::as_function()). If
these arguments are named, those names will be used in the tibble
output. If the summary results are named they will take precedence.
-
-my_sd <- function(x) c(My_SD = sd(x))
+
+my_sd <- function(x) c(My_SD = sd(x))
fit$summary(
c("mu", "tau"),
MEAN = mean,
@@ -218,45 +220,45 @@ Summary
~quantile(.x, probs = c(0.1, 0.9)),
Minimum = function(x) min(x)
)
- variable MEAN median My_SD 10% 90% Minimum
-1 mu 6.6 6.7 4.3 0.98 12 -11.7
-2 tau 5.8 5.0 3.7 1.81 11 0.9
+ variable MEAN median My_SD 10% 90% Minimum
+1 mu 6.6 6.7 4.2 1.3 11.7 -11.23
+2 tau 4.7 3.9 3.6 1.1 9.6 0.53
Arguments to all summary functions can also be specified with .args.
-
+
- variable 2.5% 5% 95% 97.5%
-1 mu -2.0 -0.55 14 15
-2 tau 1.1 1.33 13 15
+ variable 2.5% 5% 95% 97.5%
+1 mu -1.17 0.025 13 15
+2 tau 0.59 0.790 12 13
The summary functions are applied to the array of sample values, with
dimension iter_sampling x chains.
-
+
fit$summary(variables = NULL, dim, colMeans)
variable dim.1 dim.2 1 2 3 4
-1 lp__ 1000 4 -58.8 -58.4 -59.0 -59.4
-2 mu 1000 4 6.8 6.7 6.6 6.1
-3 tau 1000 4 5.7 5.6 5.7 6.1
-4 theta[1] 1000 4 9.9 9.5 9.8 9.5
-5 theta[2] 1000 4 7.4 7.2 7.0 6.3
-6 theta[3] 1000 4 5.8 5.7 5.6 4.8
-7 theta[4] 1000 4 6.9 6.7 7.0 6.7
-8 theta[5] 1000 4 4.9 4.8 4.6 4.1
-9 theta[6] 1000 4 5.7 5.8 5.6 4.8
-10 theta[7] 1000 4 9.6 9.8 9.4 9.2
-11 theta[8] 1000 4 7.0 7.3 7.0 7.0
+1 lp__ 1000 4 -58.0 -55.7 -55.7 -57.3
+2 mu 1000 4 6.9 7.5 5.4 6.8
+3 tau 1000 4 5.2 4.3 4.4 4.9
+4 theta[1] 1000 4 10.0 9.7 7.6 9.1
+5 theta[2] 1000 4 7.1 8.0 5.8 7.2
+6 theta[3] 1000 4 5.7 6.6 4.5 5.9
+7 theta[4] 1000 4 7.2 7.7 5.6 6.7
+8 theta[5] 1000 4 4.9 6.0 4.0 4.9
+9 theta[6] 1000 4 5.7 6.7 4.8 5.7
+10 theta[7] 1000 4 9.3 9.5 7.5 9.2
+11 theta[8] 1000 4 7.0 8.0 5.9 7.0
For this reason users may get unexpected results if they use
stats::var() directly, as it will return a covariance
matrix. An alternative is the distributional::variance()
function, which can also be accessed via posterior::variance().
-
+
variable posterior::variance ~var(as.vector(.x))
-1 mu 19 19
-2 tau 14 14
+1 mu 18 18
+2 tau 13 13
Summary functions need not be numeric, but these won’t work with $print().
-
+
strict_pos <- function(x) if (all(x > 0)) "yes" else "no"
fit$summary(variables = NULL, "Strictly Positive" = strict_pos)
# fit$print(variables = NULL, "Strictly Positive" = strict_pos)
@@ -274,6 +276,66 @@ Summary
11 theta[8] no
For more information, see posterior::summarise_draws(),
which is called by $summary().
+
+
+Extracting posterior draws/samples
+
+The $draws() method can be used to extract the posterior draws in formats
+provided by the posterior package. Here we demonstrate only the draws_array
+and draws_df formats, but the posterior package supports other useful
+formats as well.
+
+# default is a 3-D draws_array object from the posterior package
+# iterations x chains x variables
+draws_arr <- fit$draws() # or format="array"
+str(draws_arr)
+ 'draws_array' num [1:1000, 1:4, 1:11] -66.1 -68.2 -67.1 -62.4 -65.6 ...
+ - attr(*, "dimnames")=List of 3
+ ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
+ ..$ chain : chr [1:4] "1" "2" "3" "4"
+ ..$ variable : chr [1:11] "lp__" "mu" "tau" "theta[1]" ...
+
+# draws x variables data frame
+draws_df <- fit$draws(format = "df")
+str(draws_df)
+draws_df [4,000 × 14] (S3: draws_df/draws/tbl_df/tbl/data.frame)
+ $ lp__ : num [1:4000] -66.1 -68.2 -67.1 -62.4 -65.6 ...
+ $ mu : num [1:4000] -2.42 9.44 2.99 2.91 6.73 ...
+ $ tau : num [1:4000] 12.21 6.46 17.66 8.04 8.8 ...
+ $ theta[1] : num [1:4000] 5.57 11.03 -2.77 1.5 8.91 ...
+ $ theta[2] : num [1:4000] 6.97 3.31 6.77 12.84 5.79 ...
+ $ theta[3] : num [1:4000] 8.21 15.21 -8.08 -5.34 -19.54 ...
+ $ theta[4] : num [1:4000] 19.75 19.47 -7.42 -5.76 7.54 ...
+ $ theta[5] : num [1:4000] -4.12 -5.77 6.01 5.63 -3.23 ...
+ $ theta[6] : num [1:4000] -4.03 2.55 2.99 2.86 15.21 ...
+ $ theta[7] : num [1:4000] -0.186 -2.004 10.11 7.803 14.427 ...
+ $ theta[8] : num [1:4000] 0.0702 -3.005 11.0116 14.5279 14.1928 ...
+ $ .chain : int [1:4000] 1 1 1 1 1 1 1 1 1 1 ...
+ $ .iteration: int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
+ $ .draw : int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
+
+print(draws_df)
+# A draws_df: 1000 iterations, 4 chains, and 11 variables
+ lp__ mu tau theta[1] theta[2] theta[3] theta[4] theta[5]
+1 -66 -2.4 12.2 5.6 6.97 8.2 19.75 -4.12
+2 -68 9.4 6.5 11.0 3.31 15.2 19.47 -5.77
+3 -67 3.0 17.7 -2.8 6.77 -8.1 -7.42 6.01
+4 -62 2.9 8.0 1.5 12.84 -5.3 -5.76 5.63
+5 -66 6.7 8.8 8.9 5.79 -19.5 7.54 -3.23
+6 -64 5.3 11.4 18.5 13.37 -1.4 15.97 -0.61
+7 -60 7.3 9.1 8.6 7.82 3.5 0.34 -2.00
+8 -60 6.3 8.5 7.7 0.51 7.5 -0.99 1.51
+9 -59 1.9 6.9 3.0 8.63 1.4 3.70 4.72
+10 -63 9.1 9.3 16.0 5.77 3.9 4.14 -10.34
+# ... with 3990 more draws, and 3 more variables
+# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
+To convert an existing draws object to a different format use the
+posterior::as_draws_*() functions.
+To manipulate the draws objects use the various methods
+described in the posterior package vignettes
+and documentation.
diff --git a/docs/reference/fit-method-loo.html b/docs/reference/fit-method-loo.html
index 30cfdd3db..95111bd38 100644
--- a/docs/reference/fit-method-loo.html
+++ b/docs/reference/fit-method-loo.html
@@ -1,9 +1,9 @@
Leave-one-out cross-validation (LOO-CV) — fit-method-loo • cmdstanr
@@ -109,10 +109,10 @@ Leave-one-out cross-validation (LOO-CV)
The $loo() method computes approximate LOO-CV using the
-loo package. This is a simple wrapper around loo::loo.array()
-provided for convenience and requires computing the pointwise
-log-likelihood in your Stan program. See the loo package
-vignettes for details.
+loo package. In order to use this method you must compute and save
+the pointwise log-likelihood in your Stan program. See loo::loo.array()
+and the loo package vignettes
+for details.
@@ -139,8 +139,14 @@ Arguments
moment_match
-(boolean) Whether to use a moment-matching correction for
-for problematic observations.
+(logical) Whether to use a
+moment-matching correction for problematic
+observations. The default is FALSE. Using moment_match=TRUE will result
+in compiling the additional methods described in
+fit-method-init_model_methods. This allows CmdStanR to automatically
+supply the functions for the log_lik_i, unconstrain_pars,
+log_prob_upars, and log_lik_i_upars arguments to
+loo::loo_moment_match().
...
@@ -153,7 +159,8 @@ Arguments
Value
-The object returned by loo::loo.array().
+The object returned by loo::loo.array() or
+loo::loo_moment_match.default().
See also
@@ -168,15 +175,16 @@ Examples
# \dontrun{
# the "logistic" example model has "log_lik" in generated quantities
fit <- cmdstanr_example("logistic")
+#> Model executable is up to date!
loo_result <- fit$loo(cores = 2)
print(loo_result)
#>
#> Computed from 4000 by 100 log-likelihood matrix
#>
#> Estimate SE
-#> elpd_loo -63.6 4.1
+#> elpd_loo -63.7 4.1
#> p_loo 3.9 0.5
-#> looic 127.2 8.3
+#> looic 127.4 8.3
#> ------
#> Monte Carlo SE of elpd_loo is 0.0.
#>
diff --git a/docs/reference/index.html b/docs/reference/index.html
index 7c5bfb855..ad6c2b705 100644
--- a/docs/reference/index.html
+++ b/docs/reference/index.html
@@ -319,8 +319,7 @@ Fitted model objects and methods
- Return the variable skeleton needed by the utils::relist function to re-structure a
-vector of constrained parameter values to a named list
+ Return the variable skeleton for relist