
Negative sigma #85

Open
redhog opened this issue Apr 14, 2021 · 6 comments

@redhog (Collaborator) commented Apr 14, 2021

OrdinaryKriging.sigma can sometimes end up with very small negative numbers, like -1.514380e-15, very close to positions with original data. This has practical consequences since we're operating in np.float64 space:
Since OrdinaryKriging.sigma is defined as sigma^2, it is often natural to take the square root (to get values with a more sensible unit), which gives NaN in this case.
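A minimal illustration of the effect in plain NumPy (just the float64 behaviour, not the library internals):

```python
import numpy as np

# a kriging variance that should be 0 at an observation, but comes out
# as a tiny negative float64 due to round-off
sigma2 = np.float64(-1.514380e-15)

print(np.sqrt(sigma2))  # nan (with a RuntimeWarning), instead of the expected 0.0
```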

@eddjharrison

It can also end up being NaN when very close to / on top of input data, where it should be 0.

@mmaelicke (Owner)

Hey, thank you two for reporting!

Can any of you confirm that this only happens very close to / at observation locations? I am just wondering whether we are talking about a bug in the implementation, or numerical issues that can easily be rounded away...

Maybe, for an example where this is happening, you could quickly add a print(b, l) right before this line:

sigma = sum(b[:-1] * l[:-1]) + l[-1]

I am wondering whether it is l[-1], i.e. the last element of the weights array (mu, the Lagrange multiplier), becoming really small close to observation points that is causing the problems.
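For concreteness, that debug edit would look roughly like this (a sketch with made-up stand-in values; I am assuming b is the right-hand-side semivariance vector and l the solved weights, with mu as the last element):

```python
import numpy as np

# stand-ins for the solver output, purely illustrative
b = np.array([0.8, 0.9, 1.0, 1.0])      # assumed: right-hand-side semivariances
l = np.array([0.5, 0.3, 0.2, -2e-16])   # assumed: weights, last element is mu

print('b:', b)
print('l:', l, 'mu (Lagrange multiplier):', l[-1])
sigma = sum(b[:-1] * l[:-1]) + l[-1]
print('sigma:', sigma)
```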

@redhog (Collaborator, Author) commented Apr 14, 2021

As far as I can see, it only happens close to or on top of observations. We can't really guarantee that we never call transform() on the exact same coordinates as some observation; in fact, it's quite likely to happen.

@mmaelicke (Owner)

OK, then as a first guess I would say that when an observation location is in the grid passed to transform, we end up with a 0 in the kriging matrix. This leads to a singular matrix, which can't be solved. The class does catch these exceptions and counts their occurrences in an internal attribute (Kriging.singular_error). In case of an exception, np.nan is returned and Kriging.sigma is not changed.
The sigma attribute is instantiated here:

self.sigma = np.empty(len(x[0]))

Thus, we get a NaN instead of a zero. We have to keep in mind that there are other causes for a singular matrix, and therefore I would not replace the NaN with a zero by default, as actually nothing was calculated.

Would it be a possible pathway for you to use the class as is and replace the results at the indices of input coordinates (if they match) with 0?
For the very small values, we could introduce a default precision and round the results, e.g. to 6 decimal places, as 10**-15 precision does not really make sense for an interpolation. As a last step, you could then check whether there are any NaNs left in sigma that should not be there, or any negative sigmas that lie outside the chosen precision.
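A rough sketch of that caller-side workaround (all names are illustrative stand-ins, not library API):

```python
import numpy as np

# stand-ins: target grid coordinates, observation coordinates, and the
# sigma array returned by transform() (tiny negative and NaN values at observations)
x = np.array([0.0, 0.5, 1.0]);  y = np.array([0.0, 0.5, 1.0])
obs_x = np.array([0.0, 1.0]);   obs_y = np.array([0.0, 1.0])
sigma = np.array([-1.5e-15, 0.42, np.nan])

precision = 6                    # decimal places; 1e-15 artefacts are rounded away
sigma = np.round(sigma, precision)

# set sigma to 0 wherever a target point coincides exactly with an observation
targets = np.column_stack((x, y))
obs = np.column_stack((obs_x, obs_y))
coincides = (targets[:, None, :] == obs[None, :, :]).all(axis=2).any(axis=1)
sigma[coincides] = 0.0

# anything still NaN or negative at this point hints at a genuine singular matrix
suspicious = np.isnan(sigma) | (sigma < 0)
print(sigma, suspicious)
```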

@redhog (Collaborator, Author) commented Apr 15, 2021

I think it would be better to do this kind of cleanup in scikit-gstat itself, but yes, given your explanation, we can't just replace all NaNs with zeros. Couldn't we instead use the distance matrix, check for distances < 1e-14 or so, and replace all those results?
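For illustration, such a check could look roughly like this (a sketch only; targets, obs and sigma are illustrative stand-ins, not existing API):

```python
import numpy as np
from scipy.spatial.distance import cdist

# stand-ins for the target coordinates, observation coordinates and sigma result
targets = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])
obs = np.array([[0.0, 0.0], [1.0, 1.0]])
sigma = np.array([np.nan, 0.42, -1.5e-15])

threshold = 1e-14

# minimum distance from each target point to any observation
min_dist = cdist(targets, obs).min(axis=1)

# targets that numerically coincide with an observation get sigma = 0
sigma[min_dist < threshold] = 0.0
print(sigma)  # [0.   0.42 0.  ]
```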

@mmaelicke (Owner)

That sounds very sensible. Maybe even a larger threshold. We should set this as an attribute on the Kriging class, with a default value like that.
Besides the technical implications of rounding very small distances away, there is also a scientific one. One could question an interpolation estimate resulting from distances that are many orders of magnitude smaller than the spatial resolution of the Variogram. From my point of view, it does not make sense to distinguish 0.000001 m distances for kriging if the Variogram bins use 1 meter lags.
So giving the user the possibility to specify the precision of the distance matrix makes sense anyway, I think.
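To make that concrete, a purely hypothetical sketch of such an attribute (not the current scikit-gstat API):

```python
class OrdinaryKriging:
    """Hypothetical sketch only -- not the existing scikit-gstat class."""

    def __init__(self, variogram, dist_precision=1e-10):
        # distances below dist_precision are treated as zero, i.e. the
        # target point is considered to coincide with an observation
        self.variogram = variogram
        self.dist_precision = dist_precision
```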
