
Numerical issues during hyperparameter optimization #611

Open
befelix opened this issue Mar 19, 2018 · 3 comments

befelix commented Mar 19, 2018

Setting a strong gamma prior that encourages short lengthscales (e.g., Gamma(0.25, 0.5)) can lead to numerical issues during hyperparameter optimization. In particular, optimize_restarts samples hyperparameters from the hyper-prior, and these samples can be fairly small.
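
For a rough sense of scale (assuming Gamma(0.25, 0.5) means shape 0.25 and rate 0.5), a quick scipy check shows how often such a prior produces tiny draws:

import numpy as np
from scipy import stats

# Draw from a Gamma with shape 0.25 and rate 0.5 (scale = 1 / rate)
samples = stats.gamma.rvs(a=0.25, scale=1.0 / 0.5, size=100000, random_state=0)
for threshold in (1e-1, 1e-3, 1e-6):
    print("fraction of draws below %g: %.3f" % (threshold, np.mean(samples < threshold)))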

This has led to some weird bugs when running Bayesian optimization based on GPy. In particular, I have encountered situations that are equivalent to the following code snippet, which raises an error about the kernel matrix not being positive definite.

import numpy as np
import GPy

np.random.seed(1)

# Five (almost) identical inputs: all ones plus a tiny random perturbation
X = np.ones((5, 1), dtype=float)
X += 1e-8 * np.random.randn(*X.shape)

gp = GPy.models.GPRegression(X, np.ones_like(X), noise_var=1e-6)

# Setting a very short lengthscale raises the "not positive definite" error
gp.kern.lengthscale = 1e-8

If you have any suggestions on how to avoid this situation, I'm happy to hear them.

Also, if anyone has insight into why this code causes numerical issues, I would really appreciate it. I'm at a complete loss; the small random perturbation is essential for making things fail.
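
For reference, the kernel matrix in this regime can be built by hand with plain numpy to inspect its eigenvalues (a diagnostic sketch only, assuming an RBF kernel with unit variance and the 1e-6 noise added to the diagonal; it does not explain the failure):

import numpy as np

np.random.seed(1)
X = np.ones((5, 1)) + 1e-8 * np.random.randn(5, 1)
lengthscale = 1e-8

# RBF kernel matrix plus the noise variance on the diagonal
sqdist = (X - X.T) ** 2
K = np.exp(-0.5 * sqdist / lengthscale ** 2) + 1e-6 * np.eye(len(X))

# How close is the smallest eigenvalue to zero?
print(np.linalg.eigvalsh(K))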

mzwiessele commented Mar 19, 2018 via email

befelix commented Mar 19, 2018

Thanks for the fast reply. So the resulting fit is actually perfectly fine; the function is literally constant in this simple example. For example, the code below has the same issue: it's a constant function, but some of the measurements are almost duplicated. This is a setting that occurs quite frequently in Bayesian optimization once the algorithm has converged.

Is there some way to effectively constrain the lengthscales to a reasonable range? Something like gp.kern.lengthscale.constrain_bounded(1e-6, 1) has no effect, since the lengthscale values seem to be overwritten directly by the randomization in optimize_restarts.

import numpy as np
import GPy

np.random.seed(1)

X = np.linspace(-1, 1, 15)[:, None]

# Add some almost-duplicated measurements
X[-5:] = 1
X[-5:] += 1e-8 * np.random.randn(5, 1)

# Constant observations with a small noise variance
gp = GPy.models.GPRegression(X, np.ones_like(X), noise_var=1e-6)

# Keep the noise and kernel variances fixed; only the lengthscale is optimized
gp.likelihood.variance.constrain_fixed()
gp.kern.variance.constrain_fixed()

# Gamma prior on the lengthscale, specified via its expected value and variance
prior = GPy.priors.gamma_from_EV(0.5, 1)
gp.kern.lengthscale.set_prior(prior, warning=False)

gp.optimize_restarts(100)

befelix commented Mar 23, 2018

I've solved this for my case by implementing a shifted variant of the gamma prior (and adding corresponding constraints to the HMC samplers). So from my side you can close this issue, but I think it would still be nice to automatically truncate priors to respect the constraints imposed by paramz.
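
For anyone landing here later, the idea behind that workaround is roughly the following (a minimal sketch, not the actual implementation from this thread; the method names mirror GPy's prior interface of lnpdf / lnpdf_grad / rvs, and the GPy-specific wiring, such as the domain attribute checked by set_prior, is left out):

import numpy as np
from scipy import stats


class ShiftedGamma(object):
    """Gamma(shape=a, rate=b) on (x - shift), so no probability mass lies below `shift`."""

    def __init__(self, a, b, shift=1e-3):
        self.a, self.b, self.shift = a, b, shift

    def lnpdf(self, x):
        # log-density of the underlying gamma, evaluated at x - shift
        return stats.gamma.logpdf(np.asarray(x) - self.shift, a=self.a, scale=1.0 / self.b)

    def lnpdf_grad(self, x):
        # d/dx log p(x) = (a - 1) / (x - shift) - b, valid for x > shift
        return (self.a - 1.0) / (np.asarray(x, dtype=float) - self.shift) - self.b

    def rvs(self, n):
        # Draws can never fall below `shift`, so randomized restarts never
        # start the lengthscale in the numerically dangerous range.
        return stats.gamma.rvs(self.a, scale=1.0 / self.b, size=n) + self.shift


prior = ShiftedGamma(0.25, 0.5, shift=1e-3)
print(prior.rvs(5))  # every draw lies above `shift`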
