Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

frontend: effective sample size and related stats #48

Merged
merged 8 commits into from
Jun 14, 2024
Merged

Conversation

magland
Copy link
Collaborator

@magland magland commented Jun 7, 2024

This adds new columns to the output Summary table

MCSE, N_eff, N_eff/s, R_hat

The calculation of these entities is subtle. The implementation in this PR was carried over from MCMC Monitor, and that was created to match a version of bayes_kit Python package at some point in time.

These values do not exactly match the output of stansummary.

We'll need to decide on which reference implementation to target, and then how to create automated tests.

@magland magland changed the base branch from main to sampling-output-take-3 June 7, 2024 16:21
@WardBrian
Copy link
Collaborator

Hi @magland - sorry for not getting this to you sooner.

The Stan computations for rhat and ess can be found at https://github.com/stan-dev/stan/tree/develop/src/stan/analyze/mcmc

There are a few options implemented even in those files -- the ones actually used by stansummary are compute_effective_sample_size for ESS, and compute_split_potential_scale_reduction for $\hat{R}$ (note the split in this function name)

@WardBrian WardBrian linked an issue Jun 13, 2024 that may be closed by this pull request
6 tasks
@magland
Copy link
Collaborator Author

magland commented Jun 13, 2024

@WardBrian

I took a crack at translating compute_effective_sample_size() into typescript.

In the table, there are two columns (for the time being) N_Eff1 and N_Eff2. The first is the one ported from bayes_kit and the second is the one ported from Stan C++.

I don't know what the chances are that I got it right on the first attempt, but the output is within the range of reasonable... although by no means do the two columns match.

I guess next step would be to put an example through stansummary and see if it's consistent. (The other derived columns definitely won't match because I have them based on N_Eff1)

image

@magland
Copy link
Collaborator Author

magland commented Jun 14, 2024

I made #68 so that we can conveniently compare this N_eff2 with the output of stansummary. Unfortunately they are not quite matching. I carefully double checked the code and it appears like an accurate translation. Are we sure stansummary is using that function you linked?

I think the next step would be to either (a) build stansummary from source and put in debug print statements to see the values of the variables at each step; or (b) isolate those C++ functions into a minimal project and feed it sample draws, again with print statements, and make a test with the same input data in SP, with console.log statements. The former would be good because we would verify what exactly stansummary was calling, but the latter would be good because it leads to unit tests.

@WardBrian
Copy link
Collaborator

Now that I’m done with the stanc worker PR I will take a look at this today. I’ll see if I spot anything in the code and if not I can try the approach you describe to see where they first diverge.

@WardBrian
Copy link
Collaborator

I found the issue - Stan's code is not very clear about which calculation is a sample variance (divide by N - 1) and which is a population variance (divide by N).

I'm honestly not 100% confident that this is not a mistake in Stan, but I'd still argue it's better to match the Stan behavior

@WardBrian
Copy link
Collaborator

Oops, I pushed it to #68 originally.

@magland
Copy link
Collaborator Author

magland commented Jun 14, 2024

Excellent! I'll update the other columns, rearrange the file naming a bit and remove the bayes_kit stuff (we can restore later from mcmc-monitor if needed).

@WardBrian
Copy link
Collaborator

I wouldn't be surprised if Rhat has a similar issue when ported over -- for your notes, stan::math::variance is a sample variance, while boost::accumulators::stats::variance appears to be population

@magland
Copy link
Collaborator Author

magland commented Jun 14, 2024

Okay all the columns appear to be matching now (with the exception of stepsize__ which blows up, and I think it's okay).

If we merge this and #47 into main, then #68 would then be ready to be reviewed independently.

@magland magland changed the title [WIP] frontend: effective sample size and related stats frontend: effective sample size and related stats Jun 14, 2024
@WardBrian
Copy link
Collaborator

Great! I still think it’s worth writing tests and then possibly spinning the stats functions off into their own package so other people could use them, but matching stansummary is enough for me for now

@WardBrian
Copy link
Collaborator

Something is slightly off with rhat - it's not obvious with the linear regression example, but if you set the disease transmission example to 100 warmup/25 draws you can see differences. I'm trying to track down the source now

@magland
Copy link
Collaborator Author

magland commented Jun 14, 2024

Something is slightly off with rhat - it's not obvious with the linear regression example, but if you set the disease transmission example to 100 warmup/25 draws you can see differences. I'm trying to track down the source now

Maybe I used the wrong variance calc again?

@WardBrian
Copy link
Collaborator

I think it actually had to do with the split_chains function when the number of chains was odd, which I guess means 25 was a lucky choice of number by me

Still need to do a bit more validation

@WardBrian
Copy link
Collaborator

looks like it is now matching up to 6 decimals on both ESS and Rhat

Quantiles don't always agree but there are multiple valid ways to break ties there, and I don't think the stan one is necessarily better than any other

Copy link
Collaborator

@WardBrian WardBrian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few style comments but otherwise I think this is good to go!

Comment on lines 156 to 162
const uniqueChainIds = Array.from(new Set(chainIds)).sort();
const draws: number[][] = new Array(uniqueChainIds.length).fill(0).map(() => []);
for (let i = 0; i < x.length; i++) {
const chainId = chainIds[i];
const chainIndex = uniqueChainIds.indexOf(chainId);
draws[chainIndex].push(x[i]);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on splitting this into a 'drawsByChain` function or similar? It gets duplicated just below in the rhat function, and I imagine something similar is useful for the multiple csvs

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I did that.

Comment on lines +43 to +70
// check if chains are constant; all equal to first draw's value
let are_all_const = false;
const init_draw = new Array(num_chains).fill(0);
for (let chain_idx = 0; chain_idx < num_chains; chain_idx++) {
const draw = draws[chain_idx];
for (let n = 0; n < num_draws; n++) {
if (!isFinite(draw[n])) {
// we can't compute ESS if there are non-finite values
return NaN;
}
}

init_draw[chain_idx] = draw[0];

const precision = 1e-12;
if (draw.every(d => Math.abs(d - draw[0]) < precision)) {
are_all_const = true;
}
}

if (are_all_const) {
// If all chains are constant then return NaN
// if they all equal the same constant value
const precision = 1e-12;
if (init_draw.every(d => Math.abs(d - init_draw[0]) < precision)) {
return NaN;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, want to pull this into a helper? I realize this is also a suggestion I could be making to the Stan implementation...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's best to try to match the C++ implementation as closely as possible, even with the imperfections. If the goal is to make it cleaner, there's a lot that could be done... but I think the goal here is to reproduce exactly.

@@ -1,5 +1,6 @@
import { FunctionComponent, useMemo } from "react"
import { computeMean, computePercentile, computeStdDev } from "./util"
import { compute_effective_sample_size, compute_split_potential_scale_reduction } from "./stan_stats/stan_stats"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment didn't post with the others - could we move the code currently in ./util into this folder and then de-duplicate it? I think we have two mean calculations now, for example

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep stan_stats as self-contained as possible and exposing as minimum as possible, because we're planning to make this a separate package (I think it would be named something other than stan_stats of course)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was imagining said other package would also include the quantiles, mean, and standard deviation functions. I guess it doesn’t need to, but they’re related and nice to have somewhere

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Yeah I think that would be nice... but would require some planning.

Base automatically changed from sampling-output-take-3 to main June 14, 2024 21:08
@magland magland merged commit a584857 into main Jun 14, 2024
@WardBrian WardBrian deleted the ess-stats branch June 15, 2024 17:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sample output view: Stansummary equivalent
2 participants