Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GB2 distribution with strange quantiles for extreme parameter values #3

Open
stefan-1997 opened this issue Feb 26, 2024 · 1 comment

Comments

@stefan-1997
Copy link

stefan-1997 commented Feb 26, 2024

I fitted a GB2 distribution to my income data and then got puzzled when examining the parameter estimates. I observed that for parameters similar to c(809.6687, 116.4727, 0.0342, 0.0278), the quantile values returned by qGB2 are somewhat strange. For instance, the 10% Percentile is still 0 which does not make sense to me, in particular when having a look at the associated density function. Also, using the random number generator rGB2 many 0 values are drawn from the distribution.

I compared the results to the ones by the GB2 R package. Here, the quantile values look more reasonable to me for the very same set of GB2 parameters. For standard GB2 parameters, the two packages return identical results but not for some extreme parameter values. Why is that and might this be a potential bug in the gamlss.dist package?

Here, my example code:

##### GB2 Distribution #####

rm(list = ls())

set.seed(79)

library(GB2)
library(gamlss.dist)

params = c(809.6687, 116.4727, 0.0342, 0.0278)
# params = c(1, 1, 1, 0.5)
mu = params[1]
sigma = params[2]
nu = params[3]
tau = params[4]

GB2::qgb2(c(0.1, 0.5, 0.9), shape1 = sigma, scale = mu, shape2 = nu, shape3 = tau)
gamlss.dist::qGB2(c(0.1, 0.5, 0.9), mu = mu, sigma = sigma, nu = nu, tau = tau)

GB2::pgb2(c(10, 100, 1000, 10000), shape1 = sigma, scale = mu, shape2 = nu, shape3 = tau)
gamlss.dist::pGB2(c(10, 100, 1000, 10000), mu = mu, sigma = sigma, nu = nu, tau = tau)

GB2::dgb2(c(10, 100, 1000, 10000), shape1 = sigma, scale = mu, shape2 = nu, shape3 = tau)
gamlss.dist::dGB2(c(10, 100, 1000, 10000), mu = mu, sigma = sigma, nu = nu, tau = tau)


r1 = GB2::rgb2(100000, shape1 = sigma, scale = mu, shape2 = nu, shape3 = tau)
summary(r1)
sum(r1 == 0)
r2 = gamlss.dist::rGB2(100000, mu = mu, sigma = sigma, nu = nu, tau = tau)
summary(r2)
sum(r2 == 0)
@zeileis
Copy link
Member

zeileis commented Feb 26, 2024

Thanks, Stefan @stefan-1997, for raising this point, very useful.

The differences come from GB2 using the d/p/q/r functions for the beta distribution, i.e., qbeta() and friends, while gamlss.dist relies on the F distribution, i.e., qf() and friends. I simplified your quantile example to:

GB2::qgb2(0.1, 20, 20, 0.03, 0.02)
## [1] 1.981105
gamlss.dist::qGB2(0.1, 20, 20, 0.03, 0.02)
## [1] 0

The reason for this is that qbeta() is still stable while qf() is not:

qbeta(0.1, 0.03, 0.02)
## [1] 8.270811e-21
qf(0.1, 2 * 0.03, 2 * 0.02)
## [1] 0

Similar problems seem to occur with the corresponding "p" and "d" function where "beta" works but "f" does not in certain cases.

Whether this advantage of beta over F only applies to this case or more generally, I cannot say. But it might be worth having a closer look. Mikis @mstasinopoulos and Bob @rigbyr did you consider this when implementing GB2 in gamlss.dist?

Finally, the random number generator gamlss.dist::rGB2() uses the inversion method which is problematic for many distributions. Here, it inherits the pathologies above from qGB2(). In contrast, GB2::rgb2() directly uses rbeta() which, again, seems to be numerically more stable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants