diff --git a/docs/articles/posterior.html b/docs/articles/posterior.html
index 0e0439f75..24f6b6a34 100644
--- a/docs/articles/posterior.html
+++ b/docs/articles/posterior.html
@@ -144,72 +144,74 @@

Working with Posteriors

NOTE: the hack below of using print.data.frame in chunks with echo=FALSE is used because the pillar formatting of posterior draws_summary objects isn't playing nicely with pkgdown::build_articles(). When that is fixed
-using options(digits=2) won't be necessary anymore.
+using options(digits=2) also won't be necessary anymore.
-->
-Summary
+Summary statistics

We can easily customise the summary statistics reported by $summary() and $print().

 fit <- cmdstanr::cmdstanr_example("schools", method = "sample")
 fit$summary()
-Warning: 130 of 4000 (3.0%) transitions ended with a divergence.
+Warning: 302 of 4000 (8.0%) transitions ended with a divergence.
 See https://mc-stan.org/misc/warnings for details.
-   variable  mean median  sd mad     q5 q95 rhat ess_bulk ess_tail
-1      lp__ -58.9  -59.2 5.0 5.1 -66.97 -50    1      224       84
-2        mu   6.6    6.7 4.3 4.3  -0.55  14    1      394      115
-3       tau   5.8    5.0 3.7 3.4   1.33  13    1      223       92
-4  theta[1]   9.7    9.0 7.3 6.3  -0.99  23    1     1066     2034
-5  theta[2]   7.0    6.8 5.9 5.6  -2.54  16    1      900     2321
-6  theta[3]   5.5    5.8 6.9 6.1  -6.25  16    1      841     2201
-7  theta[4]   6.8    6.9 6.2 6.0  -3.13  17    1      781     2193
-8  theta[5]   4.6    5.0 6.0 5.8  -5.63  14    1      513      940
-9  theta[6]   5.5    5.7 6.3 5.6  -5.61  15    1      782     1784
-10 theta[7]   9.5    9.1 6.2 5.8   0.14  20    1      882     2164
-11 theta[8]   7.1    7.0 7.3 6.3  -4.80  18    1      976     2151
+Warning: 1 of 4 chains had an E-BFMI less than 0.2.
+See https://mc-stan.org/misc/warnings for details.
+   variable  mean median  sd mad      q5 q95 rhat ess_bulk ess_tail
+1      lp__ -56.7  -57.1 6.1 6.4 -66.008 -46  1.1       33       19
+2        mu   6.6    6.7 4.2 4.5   0.025  13  1.0      124      879
+3       tau   4.7    3.9 3.6 3.4   0.790  12  1.1       33       22
+4  theta[1]   9.1    8.5 6.8 6.0  -0.082  21  1.0      163      405
+5  theta[2]   7.0    6.9 5.5 5.8  -1.456  16  1.0      236     1810
+6  theta[3]   5.7    6.1 6.4 6.2  -4.835  16  1.0      313     1466
+7  theta[4]   6.8    6.9 5.9 5.8  -2.355  16  1.0      246     1356
+8  theta[5]   4.9    5.2 5.8 5.5  -5.038  13  1.0      219      932
+9  theta[6]   5.7    5.9 5.8 5.8  -4.073  15  1.0      278     1208
+10 theta[7]   8.9    8.6 6.0 5.7   0.094  19  1.0      176      352
+11 theta[8]   7.0    7.1 6.5 5.9  -3.088  18  1.0      317     1709

By default all variables are summarised with the following functions:

-
+
 
[1] "mean"      "median"    "sd"        "mad"       "quantile2"

To change the variables summarised, we use the variables argument

-
+
 fit$summary(variables = c("mu", "tau"))
  variable mean median  sd mad    q5 q95 rhat ess_bulk ess_tail
-1       mu  6.6    6.7 4.3 4.3 -0.55  14    1      394      115
-2      tau  5.8    5.0 3.7 3.4  1.33  13    1      223       92
+1       mu  6.6    6.7 4.2 4.5 0.025  13  1.0      124      879
+2      tau  4.7    3.9 3.6 3.4 0.790  12  1.1       33       22

We can additionally change which functions are used

-
+
 fit$summary(variables = c("mu", "tau"), mean, sd)
  variable mean  sd
-1       mu  6.6 4.3
-2      tau  5.8 3.7
+1       mu  6.6 4.2
+2      tau  4.7 3.6

To summarise all variables with non-default functions, it is necessary to explicitly set the variables argument, either to NULL or to the full vector of variable names.

-
+
 fit$metadata()$model_params
 fit$summary(variables = NULL, "mean", "median")
 [1] "lp__"     "mu"       "tau"      "theta[1]" "theta[2]" "theta[3]"
  [7] "theta[4]" "theta[5]" "theta[6]" "theta[7]" "theta[8]"
   variable  mean median
-1      lp__ -58.9  -59.2
+1      lp__ -56.7  -57.1
 2        mu   6.6    6.7
-3       tau   5.8    5.0
-4  theta[1]   9.7    9.0
-5  theta[2]   7.0    6.8
-6  theta[3]   5.5    5.8
+3       tau   4.7    3.9
+4  theta[1]   9.1    8.5
+5  theta[2]   7.0    6.9
+6  theta[3]   5.7    6.1
 7  theta[4]   6.8    6.9
-8  theta[5]   4.6    5.0
-9  theta[6]   5.5    5.7
-10 theta[7]   9.5    9.1
-11 theta[8]   7.1    7.0
+8  theta[5]   4.9    5.2
+9  theta[6]   5.7    5.9
+10 theta[7]   8.9    8.6
+11 theta[8]   7.0    7.1

Summary functions can be specified by character string, function, or using a formula (or anything else supported by [rlang::as_function]). If these arguments are named, those names will be used in the tibble output. If the summary results are named they will take precedence.

-
-my_sd <- function(x) c(My_SD = sd(x))
+
+my_sd <- function(x) c(My_SD = sd(x))
 fit$summary(
   c("mu", "tau"), 
   MEAN = mean, 
@@ -218,45 +220,45 @@ Summary
   ~quantile(.x, probs = c(0.1, 0.9)),
   Minimum = function(x) min(x)
 )

-
  variable MEAN median My_SD  10% 90% Minimum
-1       mu  6.6    6.7   4.3 0.98  12   -11.7
-2      tau  5.8    5.0   3.7 1.81  11     0.9
+
  variable MEAN median My_SD 10%  90% Minimum
+1       mu  6.6    6.7   4.2 1.3 11.7  -11.23
+2      tau  4.7    3.9   3.6 1.1  9.6    0.53

Arguments to all summary functions can also be specified with .args.

-
+
 fit$summary(c("mu", "tau"), quantile, .args = list(probs = c(0.025, .05, .95, .975)))
-
  variable 2.5%    5% 95% 97.5%
-1       mu -2.0 -0.55  14    15
-2      tau  1.1  1.33  13    15
+
  variable  2.5%    5% 95% 97.5%
+1       mu -1.17 0.025  13    15
+2      tau  0.59 0.790  12    13

The summary functions are applied to the array of sample values for each variable, which has dimension iter_sampling x chains.

-
+
 fit$summary(variables = NULL, dim, colMeans)
   variable dim.1 dim.2     1     2     3     4
-1      lp__  1000     4 -58.8 -58.4 -59.0 -59.4
-2        mu  1000     4   6.8   6.7   6.6   6.1
-3       tau  1000     4   5.7   5.6   5.7   6.1
-4  theta[1]  1000     4   9.9   9.5   9.8   9.5
-5  theta[2]  1000     4   7.4   7.2   7.0   6.3
-6  theta[3]  1000     4   5.8   5.7   5.6   4.8
-7  theta[4]  1000     4   6.9   6.7   7.0   6.7
-8  theta[5]  1000     4   4.9   4.8   4.6   4.1
-9  theta[6]  1000     4   5.7   5.8   5.6   4.8
-10 theta[7]  1000     4   9.6   9.8   9.4   9.2
-11 theta[8]  1000     4   7.0   7.3   7.0   7.0
+1      lp__  1000     4 -58.0 -55.7 -55.7 -57.3
+2        mu  1000     4   6.9   7.5   5.4   6.8
+3       tau  1000     4   5.2   4.3   4.4   4.9
+4  theta[1]  1000     4  10.0   9.7   7.6   9.1
+5  theta[2]  1000     4   7.1   8.0   5.8   7.2
+6  theta[3]  1000     4   5.7   6.6   4.5   5.9
+7  theta[4]  1000     4   7.2   7.7   5.6   6.7
+8  theta[5]  1000     4   4.9   6.0   4.0   4.9
+9  theta[6]  1000     4   5.7   6.7   4.8   5.7
+10 theta[7]  1000     4   9.3   9.5   7.5   9.2
+11 theta[8]  1000     4   7.0   8.0   5.9   7.0

For this reason users may have unexpected results if they use stats::var() directly, as it will return a covariance matrix. An alternative is the distributional::variance() function, which can also be accessed via posterior::variance().

-
-fit$summary(c("mu", "tau"), posterior::variance, ~var(as.vector(.x)))
+
+fit$summary(c("mu", "tau"), posterior::variance, ~var(as.vector(.x)))
  variable posterior::variance ~var(as.vector(.x))
-1       mu                  19                  19
-2      tau                  14                  14
+1       mu                  18                  18
+2      tau                  13                  13
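
To make the point about stats::var() concrete, here is a minimal base R sketch. It assumes, as described above, that each summary function receives an iterations x chains matrix. The matrix x below is simulated for illustration and is not taken from the fit.

```r
# Simulated stand-in for one variable's draws: 1000 iterations x 4 chains.
set.seed(1)
x <- matrix(rnorm(4000, mean = 6.6, sd = 4.2), nrow = 1000, ncol = 4)

# stats::var() treats the columns (chains) as separate variables,
# so it returns a 4 x 4 covariance matrix rather than a single number.
cov_mat <- stats::var(x)
dim(cov_mat)

# Flattening the matrix first gives one variance over all draws.
pooled_var <- stats::var(as.vector(x))
length(pooled_var)
```

This is why the example above wraps the call as ~var(as.vector(.x)) when a single pooled variance is wanted.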

Summary functions need not be numeric, but these won’t work with $print().

-
+
 strict_pos <- function(x) if (all(x > 0)) "yes" else "no"
 fit$summary(variables = NULL, "Strictly Positive" = strict_pos)
 # fit$print(variables = NULL, "Strictly Positive" = strict_pos)
@@ -274,6 +276,66 @@

Summary

For more information, see posterior::summarise_draws(), which is called by $summary().
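
Since $summary() forwards to posterior::summarise_draws(), the same options can be tried without fitting a model. The sketch below assumes the posterior package is installed and uses its bundled example_draws() data in place of the fit object above, so the numbers will not match the tables in this article.

```r
library(posterior)

# Eight-schools example draws shipped with posterior (not the fit above).
x <- example_draws("eight_schools")

# A named function, a character string, and a formula in one call.
s <- summarise_draws(
  x,
  MEAN = mean,
  "median",
  ~quantile(.x, probs = c(0.1, 0.9))
)
print(s)

# Extra arguments shared by all summary functions go in .args.
q <- summarise_draws(x, quantile, .args = list(probs = c(0.025, 0.975)))
print(q)
```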

+
+
+

Extracting posterior draws/samples +

+The $draws()
+method can be used to extract the posterior draws in formats provided by
+the posterior
+package. Here we demonstrate only the draws_array and
+draws_df formats, but the posterior
+package supports other useful formats as well.

+
+# default is a 3-D draws_array object from the posterior package
+# iterations x chains x variables
+draws_arr <- fit$draws() # or format="array"
+str(draws_arr)
+
 'draws_array' num [1:1000, 1:4, 1:11] -66.1 -68.2 -67.1 -62.4 -65.6 ...
+ - attr(*, "dimnames")=List of 3
+  ..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
+  ..$ chain    : chr [1:4] "1" "2" "3" "4"
+  ..$ variable : chr [1:11] "lp__" "mu" "tau" "theta[1]" ...
+
+# draws x variables data frame
+draws_df <- fit$draws(format = "df")
+str(draws_df)
+
draws_df [4,000 × 14] (S3: draws_df/draws/tbl_df/tbl/data.frame)
+ $ lp__      : num [1:4000] -66.1 -68.2 -67.1 -62.4 -65.6 ...
+ $ mu        : num [1:4000] -2.42 9.44 2.99 2.91 6.73 ...
+ $ tau       : num [1:4000] 12.21 6.46 17.66 8.04 8.8 ...
+ $ theta[1]  : num [1:4000] 5.57 11.03 -2.77 1.5 8.91 ...
+ $ theta[2]  : num [1:4000] 6.97 3.31 6.77 12.84 5.79 ...
+ $ theta[3]  : num [1:4000] 8.21 15.21 -8.08 -5.34 -19.54 ...
+ $ theta[4]  : num [1:4000] 19.75 19.47 -7.42 -5.76 7.54 ...
+ $ theta[5]  : num [1:4000] -4.12 -5.77 6.01 5.63 -3.23 ...
+ $ theta[6]  : num [1:4000] -4.03 2.55 2.99 2.86 15.21 ...
+ $ theta[7]  : num [1:4000] -0.186 -2.004 10.11 7.803 14.427 ...
+ $ theta[8]  : num [1:4000] 0.0702 -3.005 11.0116 14.5279 14.1928 ...
+ $ .chain    : int [1:4000] 1 1 1 1 1 1 1 1 1 1 ...
+ $ .iteration: int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
+ $ .draw     : int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
+
+print(draws_df)
+
# A draws_df: 1000 iterations, 4 chains, and 11 variables
+   lp__   mu  tau theta[1] theta[2] theta[3] theta[4] theta[5]
+1   -66 -2.4 12.2      5.6     6.97      8.2    19.75    -4.12
+2   -68  9.4  6.5     11.0     3.31     15.2    19.47    -5.77
+3   -67  3.0 17.7     -2.8     6.77     -8.1    -7.42     6.01
+4   -62  2.9  8.0      1.5    12.84     -5.3    -5.76     5.63
+5   -66  6.7  8.8      8.9     5.79    -19.5     7.54    -3.23
+6   -64  5.3 11.4     18.5    13.37     -1.4    15.97    -0.61
+7   -60  7.3  9.1      8.6     7.82      3.5     0.34    -2.00
+8   -60  6.3  8.5      7.7     0.51      7.5    -0.99     1.51
+9   -59  1.9  6.9      3.0     8.63      1.4     3.70     4.72
+10  -63  9.1  9.3     16.0     5.77      3.9     4.14   -10.34
+# ... with 3990 more draws, and 3 more variables
+# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
+To convert an existing draws object to a different format use the
+posterior::as_draws_*() functions.

+To manipulate the draws objects use the various methods
+described in the posterior package vignettes
+and documentation.
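
A short sketch of converting and subsetting draws objects follows. It assumes the posterior package is installed and uses its bundled example_draws() data as a stand-in for fit$draws(), so no CmdStan installation is needed to run it.

```r
library(posterior)

# Bundled example draws: a draws_array (iterations x chains x variables).
draws_arr <- example_draws("eight_schools")

# Convert between formats with the as_draws_*() family.
draws_df  <- as_draws_df(draws_arr)      # one row per draw
draws_mat <- as_draws_matrix(draws_arr)  # draws x variables matrix

# Keep only some variables, whatever the format.
mu_tau <- subset_draws(draws_df, variable = c("mu", "tau"))

ndraws(draws_df)
variables(mu_tau)
dim(draws_mat)
```

The same calls should apply to the objects returned by fit$draws(), since those are posterior draws objects as well.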

diff --git a/docs/reference/fit-method-loo.html b/docs/reference/fit-method-loo.html
index 30cfdd3db..95111bd38 100644
--- a/docs/reference/fit-method-loo.html
+++ b/docs/reference/fit-method-loo.html
@@ -1,9 +1,9 @@
Leave-one-out cross-validation (LOO-CV) — fit-method-loo • cmdstanr
@@ -109,10 +109,10 @@

Leave-one-out cross-validation (LOO-CV)

The $loo() method computes approximate LOO-CV using the
-loo package. This is a simple wrapper around loo::loo.array()
-provided for convenience and requires computing the pointwise
-log-likelihood in your Stan program. See the loo package
-vignettes for details.
+loo package. In order to use this method you must compute and save
+the pointwise log-likelihood in your Stan program. See loo::loo.array()
+and the loo package vignettes
+for details.

@@ -139,8 +139,14 @@

Arguments

moment_match
-(boolean) Whether to use a moment-matching correction for
-for problematic observations.
+(logical) Whether to use a
+moment-matching correction for problematic
+observations. The default is FALSE. Using moment_match=TRUE will result
+in compiling the additional methods described in
+fit-method-init_model_methods. This allows CmdStanR to automatically
+supply the functions for the log_lik_i, unconstrain_pars,
+log_prob_upars, and log_lik_i_upars arguments to
+loo::loo_moment_match().

...
@@ -153,7 +159,8 @@

Arguments

Value

-The object returned by loo::loo.array().
+The object returned by loo::loo.array() or
+loo::loo_moment_match.default().

See also

@@ -168,15 +175,16 @@

Examples

 # \dontrun{
 # the "logistic" example model has "log_lik" in generated quantities
 fit <- cmdstanr_example("logistic")
+#> Model executable is up to date!
 loo_result <- fit$loo(cores = 2)
 print(loo_result)
 #>
 #> Computed from 4000 by 100 log-likelihood matrix
 #>
 #>          Estimate  SE
-#> elpd_loo    -63.6 4.1
+#> elpd_loo    -63.7 4.1
 #> p_loo         3.9 0.5
-#> looic       127.2 8.3
+#> looic       127.4 8.3
 #> ------
 #> Monte Carlo SE of elpd_loo is 0.0.
 #>
diff --git a/docs/reference/index.html b/docs/reference/index.html
index 7c5bfb855..ad6c2b705 100644
--- a/docs/reference/index.html
+++ b/docs/reference/index.html
@@ -319,8 +319,7 @@

Fitted model objects and methods

variable_skeleton()

-Return the variable skeleton needed by the utils::relist function to re-structure a
-vector of constrained parameter values to a named list
+Return the variable skeleton for relist
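
As a rough illustration of what such a skeleton is for, the base R sketch below calls utils::relist() with a hand-written skeleton. The skeleton shape (mu, tau, and an eight-element theta) and the values in flat are assumptions modelled on the schools example. They are not output from variable_skeleton().

```r
# Hypothetical skeleton: only names and sizes matter, values are placeholders.
skeleton <- list(mu = 0, tau = 0, theta = numeric(8))

# A flat vector of illustrative parameter values, in skeleton order.
flat <- c(6.6, 4.7, 9.1, 7.0, 5.7, 6.8, 4.9, 5.7, 8.9, 7.0)

# relist() restores the list structure described by the skeleton.
params <- utils::relist(flat, skeleton)
str(params)
```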

expose_functions()

diff --git a/vignettes/posterior.Rmd b/vignettes/posterior.Rmd
index 465391d8b..cb54a14a7 100644
--- a/vignettes/posterior.Rmd
+++ b/vignettes/posterior.Rmd
@@ -25,7 +25,7 @@ using options(digits=2) also won't be necessary anymore.
 options(digits=2)
 ```
 
-## Summary
+## Summary statistics
 
 We can easily customise the summary statistics reported by `$summary()` and `$print()`.
 
@@ -135,3 +135,32 @@ print.data.frame(fit$summary(variables = NULL, "Strictly Positive" = strict_pos)
 ```
 
 For more information, see `posterior::summarise_draws()`, which is called by `$summary()`.
+
+
+## Extracting posterior draws/samples
+
+The [`$draws()`](https://mc-stan.org/cmdstanr/reference/fit-method-draws.html)
+method can be used to extract the posterior draws in formats provided by the
+[**posterior**](https://mc-stan.org/posterior/) package. Here we demonstrate
+only the `draws_array` and `draws_df` formats, but the **posterior** package
+supports other useful formats as well.
+
+```{r draws, message=FALSE}
+# default is a 3-D draws_array object from the posterior package
+# iterations x chains x variables
+draws_arr <- fit$draws() # or format="array"
+str(draws_arr)
+
+# draws x variables data frame
+draws_df <- fit$draws(format = "df")
+str(draws_df)
+print(draws_df)
+```
+
+To convert an existing draws object to a different format use the
+`posterior::as_draws_*()` functions.
+
+To manipulate the `draws` objects use the various methods described in the
+posterior package [vignettes](https://mc-stan.org/posterior/articles/index.html)
+and [documentation](https://mc-stan.org/posterior/reference/index.html).
+