Scaling of kernel functions #157
Comments
Hi @DOSull, volume preservation doesn’t matter in the sense of reproducing 'canonical' results; however, I'm not sure how it would or would not impact GWR estimation. I suspect it wouldn't, or at least not significantly.
I think that’s a fair point. I didn’t follow the comment about none of them being parameterized by a bandwidth. GWR always uses a bandwidth to determine the distances (`zs`) used in those functions. I guess that's fundamentally different from the bandwidth, $w$, you mention. FWIW, the triangular, quadratic, and quartic functions don’t see much play in GWR. The set of functions was ported from the kernel module of the weights package and then modified to remove truncations (i.e., arbitrary removal of near-zero interactions) that existed for memory optimization.
I think @DOSull means that, without a `(1/(sqrt(2 * pi)*bw))` scaling factor, the Gaussian kernel we estimate is only proportional to the "correct" one that has that scaling factor. Imho we should keep everything volume preserving. I'm not aware of this being studied in GWR, but I bet it's been studied in GAMs? Reasoning through it, the constants should cancel out within the WLS estimator, no? However, we would get in trouble for many other estimators or statistics, so the kernel should be fully specified.
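A quick numerical check of the "constants cancel in WLS" reasoning, as a minimal sketch rather than anything taken from `mgwr`: the `wls` helper, the simulated data, the distances `d`, and the bandwidth `bw` are all made up for illustration. The point is only that multiplying every weight by the same positive constant `c` scales both `X'WX` and `X'Wy` by `c`, so the solved coefficients are unchanged.

```python
import numpy as np


def wls(X, y, w):
    """Weighted least squares: solve (X' W X) beta = X' W y for beta."""
    XtW = X.T * w  # same as X.T @ np.diag(w), without building the n x n matrix
    return np.linalg.solve(XtW @ X, XtW @ y)


rng = np.random.default_rng(0)
n = 200

# Hypothetical data: intercept plus one covariate, with noise.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)

# Hypothetical distances from one focal location, and an assumed bandwidth.
d = rng.uniform(0.0, 3.0, size=n)
bw = 1.5
z = d / bw

w_raw = np.exp(-0.5 * z**2)                      # Gaussian weights, no constant
w_scaled = w_raw / (np.sqrt(2.0 * np.pi) * bw)   # same weights with the constant

beta_raw = wls(X, y, w_raw)
beta_scaled = wls(X, y, w_scaled)

print(beta_raw)
print(beta_scaled)
print(np.allclose(beta_raw, beta_scaled))
```

This only covers the local coefficient estimates at a single location. Any statistic that uses the weights in absolute rather than relative terms would need its own check.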
My (limited) understanding of GWR would be that all these scaling factors will cancel out since it's all relative anyway. This was just something I came across along the way and as a good FOSS citizen thought I should mention. I came across it in the context of a round of updating of the CSR model in our NetLogo model zoo where I realised that our quartic kernel method and diffusion smoothing method were yielding different total volumes under the surface, because we were certainly using the wrong scaling factor for the quartic function option! That led me to the books, one of which was Fotheringham et al. 2002 on GWR, and from there to here. There are surprisingly few explicit statements out there of the Returning to Alternatively, as @ljwolf points out, the kernel should be fully specified on some principled basis if only so that competing implementations are consistent on such things and don't inadvertently yield different results for obscure reasons related to this kind of minor difference!
Hello! This is an exciting Issue, at least to me. Whether the normalization of those functions matters seems (to me) to be a matter of how you've coded the rest of things, which I think @DOSull and I plead guilty to not having explored beyond the code directly in question above, either by testing or reading/reasoning. I am selfishly interested in the answer as it has some bearing on something I'm interested in seeing, feature-wise. I wonder whether I could get a spatial statistician cleverer than I am to comment on what would be needed to implement a generalized data-driven kernel option--an analogue to a
Not really an issue, as I don't use `mgwr` (or haven't yet), more a question.
Are the constant scale factors applied to the kernel functions in `_kernel_funcs(self, zs)` here correct?
Perhaps more relevant, in the GWR context are they even necessary?
I ask because there is an odd mixture of constants applied to the kernels in some cases and not applied in others.
In density estimation applications of spatial kernels, volume preservation under the density surface is important. If this matters in the GWR case, then rather obviously the triangular function is not volume preserving as written, given that the volume of a cone is $\pi r^2 h/3$ for a cone of height $h$ and base radius $r$. This means that a volume preserving triangular kernel of bandwidth $w$ would have height (i.e. constant multiplier) $3/(\pi w^2)$. That multiplier is not applied. Similarly, the Gaussian and quartic kernels as written are not volume preserving, and none of the others are either, given that none of them are parameterised by a bandwidth.
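For what it's worth, the cone argument can be checked numerically. The following is a sketch under my own assumptions, not code from `mgwr`: it treats the triangular kernel as the surface $1 - r/w$ on the disc $r \le w$, and `radial_volume`, `cone`, and the bandwidth value are hypothetical names and numbers chosen for illustration.

```python
import numpy as np


def radial_volume(kernel, w, n=200_000):
    """Volume under a radially symmetric surface kernel(r) over the disc r <= w.

    Uses a midpoint rule on the radial integral of kernel(r) * 2*pi*r dr.
    """
    dr = w / n
    r = (np.arange(n) + 0.5) * dr  # midpoints of n equal-width rings
    return float(np.sum(kernel(r) * 2.0 * np.pi * r) * dr)


w = 2.5  # assumed bandwidth, arbitrary units


def cone(r):
    """Unscaled triangular kernel: height 1 at r = 0, falling to 0 at r = w."""
    return 1.0 - r / w


vol_unscaled = radial_volume(cone, w)
vol_scaled = radial_volume(lambda r: 3.0 / (np.pi * w**2) * cone(r), w)

print(vol_unscaled, np.pi * w**2 / 3.0)  # the two should agree closely
print(vol_scaled)                        # should be very close to 1
```

The unscaled volume depends on $w$, which is why a volume preserving version needs the bandwidth in its multiplier.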
They're not really kernels, which usually implies estimation of a PDF, so much as spatial interaction functions.
Anyway... if volume preservation doesn't matter then there's no point in applying any constant scaling factors to any of them, so the `(3. / 4)` applied to the quadratic and the `(15. / 16)` applied to the quartic can safely be dropped, and some infinitesimal amount of time saved!
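One possible reading of where those two constants come from: 3/4 and 15/16 are the one-dimensional normalising constants for $(1 - u^2)$ and $(1 - u^2)^2$ on $[-1, 1]$, so they give unit area under a curve but not unit volume under a surface. The sketch below assumes the quadratic is $\tfrac{3}{4}(1 - u^2)$ and the quartic is $\tfrac{15}{16}(1 - u^2)^2$ on $0 \le u \le 1$, which is my reading of the constants quoted above and not a check of the actual code. The helper names are hypothetical.

```python
import numpy as np

n = 200_000
du = 1.0 / n
u = (np.arange(n) + 0.5) * du  # midpoints on (0, 1), scaled distance

# Assumed forms of the two kernels, with the constants quoted in the issue.
kernels = {
    "quadratic": 0.75 * (1.0 - u**2),
    "quartic": (15.0 / 16.0) * (1.0 - u**2) ** 2,
}


def line_integral(k):
    """Area under the kernel on [-1, 1], using symmetry about 0."""
    return float(2.0 * np.sum(k) * du)


def disc_volume(k):
    """Volume under the radially symmetric kernel over the unit disc."""
    return float(np.sum(k * 2.0 * np.pi * u) * du)


results = {name: (line_integral(k), disc_volume(k)) for name, k in kernels.items()}

for name, (area, volume) in results.items():
    print(f"{name}: 1D area = {area:.4f}, 2D volume = {volume:.4f}")
```

Under these assumptions both areas come out at 1, while the volumes are $3\pi/8 \approx 1.178$ for the quadratic and $5\pi/16 \approx 0.982$ for the quartic. So if volume preservation in the plane were the goal, the multipliers would need to be $2/\pi$ and $3/\pi$ instead, before any bandwidth scaling.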