
Numerical issues during hyperparameter optimization #611

Open
befelix opened this issue Mar 19, 2018 · 3 comments

befelix commented Mar 19, 2018

Setting a strong gamma prior that encourages short lengthscales (e.g., Gamma(0.25, 0.5)) can lead to numerical issues during hyperparameter optimization. In particular, optimize_restarts samples hyperparameters from the hyper-prior, and these samples can be fairly small.
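
For a rough sense of scale (assuming Gamma(0.25, 0.5) means shape 0.25 and rate 0.5), a quick scipy check shows how often such a prior produces tiny draws:

import numpy as np
from scipy import stats

# Draw from a Gamma with shape 0.25 and rate 0.5 (scale = 1 / rate)
samples = stats.gamma.rvs(a=0.25, scale=1.0 / 0.5, size=100000, random_state=0)
for threshold in (1e-1, 1e-3, 1e-6):
    print("fraction of draws below %g: %.3f" % (threshold, np.mean(samples < threshold)))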

This has led to some weird bugs when running Bayesian optimization based on GPy. In particular, I have encountered situations that are equivalent to the following code snippet, which raises an error about the kernel matrix not being positive definite.

import numpy as np
import GPy

np.random.seed(1)

# Five (almost) identical inputs: all ones plus a tiny random perturbation
X = np.ones((5, 1), dtype=float)
X += 1e-8 * np.random.randn(*X.shape)

gp = GPy.models.GPRegression(X, np.ones_like(X), noise_var=1e-6)

# Setting a very short lengthscale raises the "not positive definite" error
gp.kern.lengthscale = 1e-8

If you have any suggestions on how to avoid this situation, I'm happy to hear them.

Also, if anyone has insight into why this code causes numerical issues, I would really appreciate it. I'm at a complete loss; the small random perturbation is essential for making things fail.
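
For reference, the kernel matrix in this regime can be built by hand with plain numpy to inspect its eigenvalues (a diagnostic sketch only, assuming an RBF kernel with unit variance and the 1e-6 noise added to the diagonal; it does not explain the failure):

import numpy as np

np.random.seed(1)
X = np.ones((5, 1)) + 1e-8 * np.random.randn(5, 1)
lengthscale = 1e-8

# RBF kernel matrix plus the noise variance on the diagonal
sqdist = (X - X.T) ** 2
K = np.exp(-0.5 * sqdist / lengthscale ** 2) + 1e-6 * np.eye(len(X))

# How close is the smallest eigenvalue to zero?
print(np.linalg.eigvalsh(K))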

mzwiessele commented Mar 19, 2018 via email

befelix commented Mar 19, 2018

Thanks for the fast reply. So the resulting fit is actually perfectly fine; the function is literally constant in this simple example. For example, the code below has the same issue: it's a constant function, but some of the measurements are almost duplicated. This is a setting that occurs quite frequently in Bayesian optimization once the algorithm has converged.

Is there some way to effectively constrain the lengthscales to a reasonable range? Something like gp.kern.lengthscale.constrain_bounded(1e-6, 1) has no effect, since the lengthscale values seem to be overwritten directly by the randomization in optimize_restarts.

import numpy as np
import GPy

np.random.seed(1)

X = np.linspace(-1, 1, 15)[:, None]

# Add some almost-duplicated measurements
X[-5:] = 1
X[-5:] += 1e-8 * np.random.randn(5, 1)

# Constant observations with a small noise variance
gp = GPy.models.GPRegression(X, np.ones_like(X), noise_var=1e-6)

# Keep the noise and kernel variances fixed; only the lengthscale is optimized
gp.likelihood.variance.constrain_fixed()
gp.kern.variance.constrain_fixed()

# Gamma prior on the lengthscale, specified via its expected value and variance
prior = GPy.priors.gamma_from_EV(0.5, 1)
gp.kern.lengthscale.set_prior(prior, warning=False)

gp.optimize_restarts(100)

befelix commented Mar 23, 2018

I've solved this for my case by implementing a shifted variant of the gamma prior (and adding corresponding constraints to the HMC samplers). So from my side you can close this issue, but I think it would still be nice to automatically truncate priors to respect the constraints imposed by paramz.
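
For anyone landing here later, the idea behind that workaround is roughly the following (a minimal sketch, not the actual implementation from this thread; the method names mirror GPy's prior interface of lnpdf / lnpdf_grad / rvs, and the GPy-specific wiring, such as the domain attribute checked by set_prior, is left out):

import numpy as np
from scipy import stats


class ShiftedGamma(object):
    """Gamma(shape=a, rate=b) on (x - shift), so no probability mass lies below `shift`."""

    def __init__(self, a, b, shift=1e-3):
        self.a, self.b, self.shift = a, b, shift

    def lnpdf(self, x):
        # log-density of the underlying gamma, evaluated at x - shift
        return stats.gamma.logpdf(np.asarray(x) - self.shift, a=self.a, scale=1.0 / self.b)

    def lnpdf_grad(self, x):
        # d/dx log p(x) = (a - 1) / (x - shift) - b, valid for x > shift
        return (self.a - 1.0) / (np.asarray(x, dtype=float) - self.shift) - self.b

    def rvs(self, n):
        # Draws can never fall below `shift`, so randomized restarts never
        # start the lengthscale in the numerically dangerous range.
        return stats.gamma.rvs(self.a, scale=1.0 / self.b, size=n) + self.shift


prior = ShiftedGamma(0.25, 0.5, shift=1e-3)
print(prior.rvs(5))  # every draw lies above `shift`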
