frontend: effective sample size and related stats #48

magland · 2024-06-07T16:21:43Z

This adds new columns to the output Summary table

MCSE, N_eff, N_eff/s, R_hat

The calculation of these quantities is subtle. The implementation in this PR was carried over from MCMC Monitor, which was written to match the bayes_kit Python package as it stood at some point in time.

These values do not exactly match the output of stansummary.

We'll need to decide on which reference implementation to target, and then how to create automated tests.

WardBrian · 2024-06-13T13:31:40Z

Hi @magland - sorry for not getting this to you sooner.

The Stan computations for rhat and ess can be found at https://github.com/stan-dev/stan/tree/develop/src/stan/analyze/mcmc

There are a few options implemented even in those files -- the ones actually used by stansummary are compute_effective_sample_size for ESS, and compute_split_potential_scale_reduction for $\hat{R}$ (note the split in this function name)

magland · 2024-06-13T23:13:13Z

@WardBrian

I took a crack at translating compute_effective_sample_size() into TypeScript.

In the table, there are two columns (for the time being) N_Eff1 and N_Eff2. The first is the one ported from bayes_kit and the second is the one ported from Stan C++.

I don't know what the chances are that I got it right on the first attempt, but the output is within a reasonable range... although by no means do the two columns match.

I guess next step would be to put an example through stansummary and see if it's consistent. (The other derived columns definitely won't match because I have them based on N_Eff1)

magland · 2024-06-14T11:11:25Z

I made #68 so that we can conveniently compare this N_Eff2 with the output of stansummary. Unfortunately they are not quite matching. I carefully double checked the code and it appears to be an accurate translation. Are we sure stansummary is using that function you linked?

I think the next step would be to either (a) build stansummary from source and put in debug print statements to see the values of the variables at each step; or (b) isolate those C++ functions into a minimal project and feed it sample draws, again with print statements, and make a test with the same input data in SP, with console.log statements. The former would be good because we would verify what exactly stansummary was calling, but the latter would be good because it leads to unit tests.

WardBrian · 2024-06-14T12:57:09Z

Now that I’m done with the stanc worker PR I will take a look at this today. I’ll see if I spot anything in the code and if not I can try the approach you describe to see where they first diverge.

WardBrian · 2024-06-14T14:16:16Z

I found the issue - Stan's code is not very clear about which calculation is a sample variance (divide by N - 1) and which is a population variance (divide by N).

I'm honestly not 100% confident that this is not a mistake in Stan, but I'd still argue it's better to match the Stan behavior

WardBrian · 2024-06-14T14:17:23Z

Oops, I pushed it to #68 originally.

magland · 2024-06-14T14:25:06Z

Excellent! I'll update the other columns, rearrange the file naming a bit and remove the bayes_kit stuff (we can restore later from mcmc-monitor if needed).

WardBrian · 2024-06-14T14:26:59Z

I wouldn't be surprised if Rhat has a similar issue when ported over -- for your notes, stan::math::variance is a sample variance, while boost::accumulators::stats::variance appears to be population
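
To make the distinction concrete, here is a minimal TypeScript sketch of the two variance conventions discussed above. The function names `sampleVariance` and `populationVariance` are hypothetical and are not taken from stan_stats.ts; the sketch only illustrates the difference in the divisor.

```typescript
// Mean of a non-empty array.
function mean(x: number[]): number {
    return x.reduce((a, b) => a + b, 0) / x.length;
}

// Sum of squared deviations from the mean.
function sumSquaredDeviations(x: number[]): number {
    const m = mean(x);
    return x.reduce((acc, v) => acc + (v - m) * (v - m), 0);
}

// Sample variance: divide by N - 1.
// (Hypothetical name; per the thread, stan::math::variance uses this convention.)
function sampleVariance(x: number[]): number {
    if (x.length < 2) return NaN;
    return sumSquaredDeviations(x) / (x.length - 1);
}

// Population variance: divide by N.
// (Hypothetical name; per the thread, the boost accumulator appears to use this one.)
function populationVariance(x: number[]): number {
    if (x.length < 1) return NaN;
    return sumSquaredDeviations(x) / x.length;
}

const exampleDraws = [1, 2, 3, 4];
console.log(sampleVariance(exampleDraws));     // 5 / 3
console.log(populationVariance(exampleDraws)); // 1.25
```

The two results differ by a factor of (N - 1) / N, which is easy to miss with long chains but becomes visible with short ones.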

magland · 2024-06-14T14:58:13Z

Okay all the columns appear to be matching now (with the exception of stepsize__ which blows up, and I think it's okay).

If we merge this and #47 into main, then #68 would then be ready to be reviewed independently.

WardBrian · 2024-06-14T15:09:20Z

Great! I still think it’s worth writing tests and then possibly spinning the stats functions off into their own package so other people could use them, but matching stansummary is enough for me for now

WardBrian · 2024-06-14T16:33:18Z

Something is slightly off with rhat - it's not obvious with the linear regression example, but if you set the disease transmission example to 100 warmup/25 draws you can see differences. I'm trying to track down the source now

magland · 2024-06-14T16:39:40Z

> Something is slightly off with rhat - it's not obvious with the linear regression example, but if you set the disease transmission example to 100 warmup/25 draws you can see differences. I'm trying to track down the source now

Maybe I used the wrong variance calc again?

WardBrian · 2024-06-14T16:56:23Z

I think it actually had to do with the split_chains function when the number of draws was odd, which I guess means 25 was a lucky choice of number by me

Still need to do a bit more validation
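
For illustration, here is a sketch of one way to split chains so that both halves have equal length when the number of draws is odd. The name `splitChains` is hypothetical. The convention shown (each half has floor(n/2) draws, the second half starts at ceil(n/2), so the middle draw is dropped) is my understanding of what the Stan C++ code does, but that is an assumption that should be checked against the source rather than taken from this sketch.

```typescript
// Hypothetical helper: split every chain into two halves of equal length.
// Assumption: when the number of draws is odd, the middle draw is dropped.
function splitChains(draws: number[][]): number[][] {
    if (draws.length === 0) return [];
    // Use the shortest chain length so all halves have the same size.
    const numDraws = Math.min(...draws.map(chain => chain.length));
    const halfLength = Math.floor(numDraws / 2);  // number of draws in each half
    const secondStart = Math.ceil(numDraws / 2);  // skips the middle draw if numDraws is odd
    const result: number[][] = [];
    for (const chain of draws) {
        result.push(chain.slice(0, halfLength));
        result.push(chain.slice(secondStart, secondStart + halfLength));
    }
    return result;
}

console.log(splitChains([[1, 2, 3, 4, 5]])); // [[1, 2], [4, 5]]
console.log(splitChains([[1, 2, 3, 4]]));    // [[1, 2], [3, 4]]
```

With an even number of draws the two conventions (dropping or keeping the middle draw) coincide, which would explain why a discrepancy only shows up for odd draw counts such as 25.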

WardBrian · 2024-06-14T17:59:24Z

looks like it is now matching up to 6 decimals on both ESS and Rhat

Quantiles don't always agree but there are multiple valid ways to break ties there, and I don't think the stan one is necessarily better than any other

WardBrian

A few style comments but otherwise I think this is good to go!

WardBrian · 2024-06-14T19:18:04Z

gui/src/app/SamplerOutputView/SummaryView.tsx

+    const uniqueChainIds = Array.from(new Set(chainIds)).sort();
+    const draws: number[][] = new Array(uniqueChainIds.length).fill(0).map(() => []);
+    for (let i = 0; i < x.length; i++) {
+        const chainId = chainIds[i];
+        const chainIndex = uniqueChainIds.indexOf(chainId);
+        draws[chainIndex].push(x[i]);
+    }


Thoughts on splitting this into a `drawsByChain` function or similar? It gets duplicated just below in the rhat function, and I imagine something similar is useful for the multiple CSVs

Okay I did that.
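
A possible shape for such a helper is sketched below. This is not the code that was merged; the signature is an assumption, including the guess that chain ids are numbers. It uses a numeric comparator because `Array.prototype.sort()` without arguments compares elements as strings, and a `Map` to avoid the repeated `indexOf` lookup.

```typescript
// Hypothetical helper: group a flat array of draws into one array per chain.
// Assumption: chainIds[i] is the (numeric) chain id of draw x[i].
function drawsByChain(x: number[], chainIds: number[]): number[][] {
    if (x.length !== chainIds.length) {
        throw new Error("x and chainIds must have the same length");
    }
    // Sort numerically; the default sort() would order 10 before 2.
    const uniqueChainIds = Array.from(new Set(chainIds)).sort((a, b) => a - b);
    // Map each chain id to its position in the output.
    const indexOfChain = new Map<number, number>();
    uniqueChainIds.forEach((id, i) => indexOfChain.set(id, i));

    const draws: number[][] = uniqueChainIds.map(() => []);
    for (let i = 0; i < x.length; i++) {
        const chainIndex = indexOfChain.get(chainIds[i]);
        if (chainIndex === undefined) continue; // cannot happen; keeps the type checker happy
        draws[chainIndex].push(x[i]);
    }
    return draws;
}

console.log(drawsByChain([10, 20, 30, 40], [2, 1, 2, 1])); // [[20, 40], [10, 30]]
```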

WardBrian · 2024-06-14T19:18:57Z

gui/src/app/SamplerOutputView/stan_stats/stan_stats.ts

+    // check if chains are constant; all equal to first draw's value
+    let are_all_const = false;
+    const init_draw = new Array(num_chains).fill(0);
+    for (let chain_idx = 0; chain_idx < num_chains; chain_idx++) {
+        const draw = draws[chain_idx];
+        for (let n = 0; n < num_draws; n++) {
+            if (!isFinite(draw[n])) {
+                // we can't compute ESS if there are non-finite values
+                return NaN;
+            }
+        }
+
+        init_draw[chain_idx] = draw[0];
+
+        const precision = 1e-12;
+        if (draw.every(d => Math.abs(d - draw[0]) < precision)) {
+            are_all_const = true;
+        }
+    }
+
+    if (are_all_const) {
+        // If all chains are constant then return NaN
+        // if they all equal the same constant value
+        const precision = 1e-12;
+        if (init_draw.every(d => Math.abs(d - init_draw[0]) < precision)) {
+            return NaN;
+        }
+    }


Similarly, want to pull this into a helper? I realize this is also a suggestion I could be making to the Stan implementation...

I think it's best to try to match the C++ implementation as closely as possible, even with the imperfections. If the goal is to make it cleaner, there's a lot that could be done... but I think the goal here is to reproduce exactly.
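
For reference, if the check were ever pulled out, a helper could look roughly like the sketch below. The names `isApproxConstant` and `shouldReturnNaN` are hypothetical, and the sketch assumes all chains are non-empty and have the same length. It is meant to reproduce the logic of the diff as shown, including the detail that the `are_all_const` flag is set as soon as any single chain is constant.

```typescript
const PRECISION = 1e-12;

// True if every value is within `precision` of the first value.
function isApproxConstant(values: number[], precision: number = PRECISION): boolean {
    return values.every(v => Math.abs(v - values[0]) < precision);
}

// Hypothetical helper mirroring the early-return conditions in the diff:
//  - any non-finite draw, or
//  - at least one chain is constant AND all chains start at the same value.
function shouldReturnNaN(draws: number[][]): boolean {
    if (draws.some(chain => chain.some(v => !isFinite(v)))) {
        return true;
    }
    const anyChainConstant = draws.some(chain => isApproxConstant(chain));
    const firstDraws = draws.map(chain => chain[0]);
    return anyChainConstant && isApproxConstant(firstDraws);
}

console.log(shouldReturnNaN([[1, 1, 1], [1, 1, 1]])); // true
console.log(shouldReturnNaN([[1, 2, 3], [1, 5, 6]])); // false
```

Keeping the two conditions in one small function would make it possible to unit test them separately from the rest of the ESS computation, while leaving the behavior identical to the C++ reference.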

WardBrian · 2024-06-14T19:21:24Z

gui/src/app/SamplerOutputView/SummaryView.tsx

@@ -1,5 +1,6 @@
 import { FunctionComponent, useMemo } from "react"
 import { computeMean, computePercentile, computeStdDev } from "./util"
+import { compute_effective_sample_size, compute_split_potential_scale_reduction } from "./stan_stats/stan_stats"


This comment didn't post with the others - could we move the code currently in ./util into this folder and then de-duplicate it? I think we have two mean calculations now, for example

I'd like to keep stan_stats as self-contained as possible and expose as little as possible, because we're planning to make this a separate package (I think it would be named something other than stan_stats of course)

I was imagining said other package would also include the quantiles, mean, and standard deviation functions. I guess it doesn’t need to, but they’re related and nice to have somewhere

I see. Yeah I think that would be nice... but would require some planning.

implement ess stats

ec71116

magland changed the base branch from main to sampling-output-take-3 June 7, 2024 16:21

Merge branch 'sampling-output-take-3' into ess-stats

bf53eef

WardBrian mentioned this pull request Jun 13, 2024

Sample output view: Stansummary equivalent #56

Closed

6 tasks

WardBrian linked an issue Jun 13, 2024 that may be closed by this pull request

Sample output view: Stansummary equivalent #56

Closed

6 tasks

try port ess compute from stan c++

bdaad30

adjust ess calc

9db52dd

magland mentioned this pull request Jun 14, 2024

frontend: export chain csvs #68

Merged

Fix ESS calculation (sample vs population variance issue)

65c7969

implement Rhat and rearrange stan stats

7e319ff

magland changed the title ~~[WIP] frontend: effective sample size and related stats~~ frontend: effective sample size and related stats Jun 14, 2024

magland mentioned this pull request Jun 14, 2024

make tests for ess and rhat computation and create a separate package #69

Open

Implement split_chains such that it has the same behavior as Stan for odd numbered draws

2dd1875

WardBrian approved these changes Jun 14, 2024

WardBrian reviewed Jun 14, 2024

helper function drawsByChain

e3d6340

Base automatically changed from sampling-output-take-3 to main June 14, 2024 21:08

magland merged commit a584857 into main Jun 14, 2024

WardBrian deleted the ess-stats branch June 15, 2024 17:14
