[Feature request] Specify models with mathematical notation #1731

stefanocoretta · 2025-01-29T10:18:18Z

When introducing simple Gaussian models like $y \sim Gaussian(\mu, \sigma)$, students get confused by the R syntax/specification

brm(y ~ 1, family = gaussian)

because proper regression models (with one predictor) are introduced later (at least in the approach I take).
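A minimal base-R sketch of what the intercept-only formula `y ~ 1` encodes may help here. It uses `lm()` rather than `brm()` so that it runs without Stan, and the data are simulated for illustration (they are not the data set shown in the summary further down). The only point is that `y ~ 1` estimates a single mean and a residual standard deviation, i.e. the `mu` and `sigma` of the Gaussian notation.

```r
# Simulated response; the values 1000 and 300 are arbitrary illustration choices.
set.seed(1)
y <- rnorm(200, mean = 1000, sd = 300)

# Intercept-only model: the "1" stands for the single constant term (mu).
fit <- lm(y ~ 1)

mu_hat    <- unname(coef(fit)[1])  # estimated intercept
sigma_hat <- sigma(fit)            # residual standard deviation

# For an intercept-only model these coincide with the sample mean and sd.
all.equal(mu_hat, mean(y))
all.equal(sigma_hat, sd(y))
```

In a Bayesian fit the estimates would be posterior summaries rather than these exact sample statistics, but the mapping from `y ~ 1` to one mean parameter and one scale parameter is the same.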

It would be nice (and pedagogically easier) if one could specify a Gaussian model like so:

m <- brm(y ~ Gaussian(mu, sigma))
# alternatively? brm(bf(y ~ Gaussian(mu, sigma)))

summary(m)

  Model: RT ~ Gaussian(mu, sigma)
  Links: mu = identity; sigma = identity 
   Data: mald (Number of observations: 5000) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

       Estimate Est.Error l-80% CI u-80% CI Rhat Bulk_ESS Tail_ESS
mu      1010.50      4.45  1004.74  1016.14 1.00     3628     2476
sigma    317.88      3.17   313.84   321.89 1.00     4064     2571

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

I assume it wouldn't be trivial to do this for all types of models, so it would be awesome to have this even for just a simple Gaussian model with no predictors (or maybe also for y ~ x?).


lunafazio · 2025-01-30T18:57:22Z

Interesting idea. I agree that having a closer fit between the notation in math vs brms could be helpful for teaching. On the other hand, I don't see the value in making it a one-off for just a covariate-free Gaussian, since that's knowledge that won't generalize and it will just confuse them anyway when they have to switch to an entirely different notation down the line.

I think extending that notation to a normal linear model is easy enough, something like:

bf(
    y ~ Gaussian(mu, sigma),
    mu = 1 + x1 + x2
)

If we can have that, then it would also be great to extend it to GLMs:

bf(
    y ~ Poisson(lambda),
    log(lambda) = 1 + x1 + x2
)

But I'm not sure that R will play nicely with something that looks like a function call to the left of =. It may be that ~ then has to be used everywhere (introducing a mismatch with the way the math is written) or that one must use the inverse of the link function instead (which breaks statistical convention, though it's not one I particularly like).

jebyrnes · 2025-01-30T19:09:45Z

You might want to look at bbmle which did something like this...
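For reference, a rough sketch of the bbmle style being alluded to, assuming the bbmle package is installed. `mle2()` accepts a formula whose right-hand side is a density function call, which is close in spirit to the requested notation, although it does maximum likelihood rather than Bayesian estimation. The data frame `d` and the starting values are made up for illustration.

```r
library(bbmle)

# Simulated data; the mean and sd are arbitrary illustration values.
set.seed(1)
d <- data.frame(y = rnorm(500, mean = 1000, sd = 300))

# Distribution-style specification: y is Gaussian with parameters mu and sigma.
# mle2() needs starting values for every named parameter.
fit <- mle2(
  y ~ dnorm(mean = mu, sd = sigma),
  start = list(mu = mean(d$y), sigma = sd(d$y)),
  data = d
)

coef(fit)
```

Here the parameter names `mu` and `sigma` are chosen by the user, and the density (`dnorm`) is written explicitly, so the code mirrors the mathematical statement fairly directly.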

stefanocoretta · 2025-01-31T11:49:55Z

@lunafazio It is true that it doesn't generalise, but pedagogically that worries me less because as soon as I introduce a regression proper (y ~ x where x is continuous) then they learn about intercepts and slopes and then the syntax and the summary make more sense.

Not saying it wouldn't be nice to be able to use math notation for all models, but it might be overkill to implement. rethinking does do that and it goes all the way down by also specifying coefficients, not just variables/terms (following code just for illustration, not working):

ulam(
  alist(
    log_gdp_std ~ dnorm(mu, sigma),
    mu <- a[cid] + b[cid] * (rugged_std - 0.215),
    a[cid] ~ dnorm(1, 0.1),
    b[cid] ~ dnorm(0, 0.3),
    sigma ~ dexp(1)
  )
)

venpopov · 2025-02-05T15:57:43Z

I made a similar suggestion a while ago: #1596

also see #1660 with related notes

venpopov · 2025-02-05T16:02:38Z

> But I'm not sure that R will play nicely with something that looks like a function call to the left of =. It may be that ~ then has to be used everywhere (introducing a mismatch with the way the math is written) or that one must use the inverse of the link function instead (which breaks statistical convention, though it's not one I particularly like).

Not out of the box, but because R uses lazy evaluation of function arguments, it can be done using some standard R tools for defusing arguments and parsing the defused expressions.
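A minimal base-R sketch of that idea follows. `math_spec()` is a hypothetical helper written for this illustration and is not part of brms; it only captures the statements unevaluated with `substitute()` and takes them apart, without fitting anything. One caveat is worth stating: a call on the left of `=` inside an argument list, such as `log(lambda) = 1 + x1 + x2`, is rejected by the R parser before lazy evaluation comes into play, so the sketch uses `~` for the linear predictor instead.

```r
# Hypothetical helper (not a brms function): capture model statements
# without evaluating them, then extract response, family, and predictors.
math_spec <- function(...) {
  # substitute() returns the call list(...) with unevaluated arguments;
  # dropping the first element (the symbol `list`) leaves the statements.
  stmts <- as.list(substitute(list(...)))[-1]

  # First statement must look like: response ~ Family(par1, par2, ...)
  lik <- stmts[[1]]
  stopifnot(is.call(lik), identical(lik[[1]], as.name("~")), is.call(lik[[3]]))

  spec <- list(
    response   = deparse(lik[[2]]),
    family     = as.character(lik[[3]][[1]]),
    pars       = vapply(as.list(lik[[3]])[-1], deparse, character(1)),
    predictors = list()
  )

  # Remaining statements: par ~ terms, or link(par) ~ terms
  for (s in stmts[-1]) {
    lhs  <- s[[2]]
    link <- if (is.call(lhs)) as.character(lhs[[1]]) else "identity"
    par  <- if (is.call(lhs)) deparse(lhs[[2]]) else deparse(lhs)
    spec$predictors[[par]] <- list(link = link, terms = deparse(s[[3]]))
  }
  spec
}

# Poisson, lambda, x1 and x2 do not need to exist: nothing is evaluated.
spec <- math_spec(
  y ~ Poisson(lambda),
  log(lambda) ~ 1 + x1 + x2
)
str(spec)

# The `=` version fails at parse time, before any function is called.
bad <- try(
  parse(text = "math_spec(y ~ Poisson(lambda), log(lambda) = 1 + x1 + x2)"),
  silent = TRUE
)
inherits(bad, "try-error")
```

A real implementation would still have to translate such a specification into the existing formula and family machinery; the sketch only shows that capturing the notation is feasible when `~` (or another parseable operator such as `<-`) is used for the linear predictor.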

paul-buerkner added the feature label Jan 30, 2025
