update about rank statistics
SamuelBrand1 committed Feb 12, 2025
1 parent da3750b commit a1adf7b
Showing 2 changed files with 7 additions and 5 deletions.
10 changes: 6 additions & 4 deletions forecasttools/sbc.py
@@ -36,13 +36,13 @@ def __init__(
Positional arguments passed to `numpyro.sample`.
num_simulations : int
How many simulations to run for SBC.
- sample_kwargs : dict[str] -> Any
+ sample_kwargs : dict[str, Any]
Arguments passed to `numpyro.sample`. Defaults to
`dict(num_warmup=500, num_samples=100, progress_bar = False)`.
Which assumes a MCMC sampler e.g. NUTS.
seed : random.PRNGKey
Random seed.
- kwargs : dict
+ kwargs : dict[str, Any]
Keyword arguments passed to `numpyro` models.
"""
if sample_kwargs is None:
@@ -149,17 +149,19 @@ def run_simulations(self):
for name in prior:
num_dims = jnp.ndim(prior_draw[name])
if num_dims == 0:
- self.simulations[name].append(
+ rank_statistics = (
(posterior[name].sel(chain=0) < prior_draw[name])
.sum()
.values
)
+ self.simulations[name].append(rank_statistics)
else:
- self.simulations[name].append(
+ rank_statistics = (
(posterior[name].sel(chain=0) < prior_draw[name])
.sum(axis=0)
.values
)
+ self.simulations[name].append(rank_statistics)
self._simulations_complete += 1
progress.update()
finally:
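Both branches of the `run_simulations` hunk compute the same quantity: the number of posterior draws that fall strictly below the cached prior draw, counted along the draw axis. The following is a minimal NumPy sketch of that computation, not the `forecasttools` implementation. The helper name `rank_statistic` and the array layout (draws on the leading axis) are assumptions made for illustration.

```python
import numpy as np


def rank_statistic(posterior_draws, prior_draw):
    """Count posterior draws strictly below the prior draw.

    Hypothetical helper: `posterior_draws` has the draw axis first,
    `prior_draw` is a scalar or matches the trailing parameter shape.
    """
    posterior_draws = np.asarray(posterior_draws)
    prior_draw = np.asarray(prior_draw)
    # Broadcasting compares every draw against the "true" value;
    # summing over axis 0 collapses the draw axis only.
    return (posterior_draws < prior_draw).sum(axis=0)


# Scalar parameter: four draws, three of them below the prior draw 1.0.
scalar_rank = rank_statistic(np.array([0.1, 0.5, 0.9, 1.3]), 1.0)

# Vector parameter: three draws of a length-2 parameter.
vector_rank = rank_statistic(
    np.array([[0.0, 2.0], [1.0, 3.0], [2.0, 4.0]]),
    np.array([1.5, 2.5]),
)
```

In this sketch a single `sum(axis=0)` covers both cases because a scalar parameter's draws form a one-dimensional array, so summing over the draw axis yields a scalar count, while a vector parameter yields one count per component.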
2 changes: 1 addition & 1 deletion notebooks/sbc_model_checking.qmd
@@ -44,7 +44,7 @@ More precisely, in SBC:
3. Generate the probability integral transform (PIT) of each of $k = 1,\dots,P$ parameters $\theta[k]$ with respect to the known prior distribution.
4. Assess the PIT distributions for deviation against an assumption of being $\mathcal{U}([0,1])$.

- In `SBC` we from the PIT by looking at the distribution of proportion of posterior samples that are less than the "true" parameter values cached along with the generated data $y^{(i)}$: $P(\theta_p^{(i)}[k] < \theta^{(i)}[k])$. This is a more convenient approach in `numpyro` than trying to solve the inverse distribution function directly.
+ In `SBC` we form the PIT by looking at the distribution of proportion of posterior samples that are less than the "true" parameter values cached along with the generated data $y^{(i)}$: $P(\theta_p^{(i)}[k] < \theta^{(i)}[k])$. These are commonly called the rank statistics of the sampling process. Using rank statistics is a more convenient approach in `numpyro` than trying to solve the inverse distribution function directly.

## Example: Eight schools

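The notebook text in this hunk says that rank statistics from a well-calibrated sampler should look uniform. A small self-contained sketch of that check follows. It does not use `numpyro` or the `SBC` class; it assumes a conjugate normal model (prior $\theta \sim \mathcal{N}(0, 1)$, likelihood $y \mid \theta \sim \mathcal{N}(\theta, 1)$) so that exact posterior draws from $\mathcal{N}(y/2, 1/2)$ are available without MCMC. All variable names and the choice of 10 bins are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
num_simulations, num_samples = 2000, 99

# Steps 1-2: draw "true" parameters from the prior, then data given each one.
theta = rng.normal(0.0, 1.0, size=num_simulations)
y = rng.normal(theta, 1.0)

# Exact posterior for this conjugate model: N(y / 2, variance 1 / 2).
posterior = rng.normal(
    y / 2.0, np.sqrt(0.5), size=(num_samples, num_simulations)
)

# Step 3: rank statistic per simulation, an integer in 0..num_samples.
ranks = (posterior < theta).sum(axis=0)

# Step 4: bin the ranks and compare against a flat histogram.
num_bins = 10
bin_index = ranks * num_bins // (num_samples + 1)
counts = np.bincount(bin_index, minlength=num_bins)
expected = num_simulations / num_bins
chi2 = ((counts - expected) ** 2 / expected).sum()
```

With exact posterior draws the ranks are uniform on the integers 0 to `num_samples`, so `chi2` should be comparable to a chi-squared variate with `num_bins - 1` degrees of freedom. A much larger value, or a visibly sloped or U-shaped histogram of `counts`, would suggest a miscalibrated posterior.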
