spherepoint_hash32: float8 needs wrapping into a Datum #107
Well, zero patch number seems to be unlucky :) Having a 32 bit platform in the test pipeline might help to catch such problems. But I haven't found such platforms on GitHub Actions unfortunately.
The embarrassing part is that I do have such a test pipeline on apt.postgresql.org, but it wasn't running for pgsphere yet: https://pgdgbuild.dus.dg-i.net/view/Snapshot/job/pgsphere-binaries-snapshot/
@@ -315,8 +315,8 @@ Datum
 spherepoint_hash32(PG_FUNCTION_ARGS)
 {
     SPoint *p1 = (SPoint *) PG_GETARG_POINTER(0);
-    Datum h1 = DirectFunctionCall1(hashfloat8, p1->lat);
-    Datum h2 = DirectFunctionCall1(hashfloat8, p1->lng);
+    Datum h1 = DirectFunctionCall1(hashfloat8, Float8GetDatum(p1->lat));
I think the code will work, but I think it will significantly decrease performance on 32 bit platforms. It is OK to fix the failures on 32 bit, but the function should be improved. We have to create a new issue for this problem.
That's how 64-bit floats are supposed to be handled on 32-bit platforms, what would you want to change there?
Maybe something like this? I haven't compiled it yet.
uint32 pgs_hashfloat8(double key)
{
    /*
     * On IEEE-float machines, minus zero and zero have different bit patterns
     * but should compare as equal. We must ensure that they have the same
     * hash value, which is most reliably done this way:
     */
    if (key == (float8) 0)
        return 0;    /* plain return: this function yields uint32, not Datum */
    /*
     * Similarly, NaNs can have different bit patterns but they should all
     * compare as equal. For backwards-compatibility reasons we force them to
     * have the hash value of a standard NaN.
     */
    if (isnan(key))
        key = get_float8_nan();
    return hash_bytes((unsigned char *) &key, sizeof(key));
}
Datum spherepoint_hash32(PG_FUNCTION_ARGS)
{
    SPoint *p1 = (SPoint *) PG_GETARG_POINTER(0);
    const uint32 h1 = pgs_hashfloat8(p1->lng);
    const uint32 h2 = pgs_hashfloat8(p1->lat);
    ...
}
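The normalization steps in the proposal above can be tried outside PostgreSQL. The following is a minimal stand-alone sketch, not the pgsphere or PostgreSQL code: `fnv1a32` is a stand-in for `hash_bytes` (it produces different hash values), `NAN` from `<math.h>` stands in for `get_float8_nan()`, and the names `sketch_hashfloat8` and `demo_hash_invariants` are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a over a byte buffer. Assumption: used here only as a stand-in
 * for PostgreSQL's hash_bytes(); the resulting values differ. */
static uint32_t fnv1a32(const unsigned char *p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash a double so that values comparing equal get equal hashes. */
static uint32_t sketch_hashfloat8(double key)
{
    unsigned char buf[sizeof key];

    if (key == 0.0)        /* true for both +0.0 and -0.0 */
        return 0;
    if (isnan(key))
        key = NAN;         /* collapse every NaN to one bit pattern */
    memcpy(buf, &key, sizeof buf);
    return fnv1a32(buf, sizeof buf);
}

/* Usage: the invariants the comments in the proposal describe. */
static void demo_hash_invariants(void)
{
    assert(sketch_hashfloat8(0.0) == sketch_hashfloat8(-0.0));
    assert(sketch_hashfloat8(NAN) == sketch_hashfloat8(-NAN));
    assert(sketch_hashfloat8(1.0) == sketch_hashfloat8(1.0));
}
```

Because the function takes a `double` and returns a 32-bit integer, nothing in it depends on the width of a pointer, which is the property the proposal is after.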
It seems to work on 64 bit platforms. I may create a PR.
(force-pushed) 6f9a86e
P.S. Compilation fails on versions 10-12.
Well, it seems my proposed solution doesn't work on PG 10-12 because compilation fails. It seems the hash functions are not declared in the headers there, or the headers are different. It is a pity that the hash functions are implemented as "pg-functions" that accept and return Datum, not as plain functions. The hash operation may be called frequently, so I believe calling palloc to wrap a float8 for every hash calculation is not a good approach on 32 bit platforms. Anyway, I propose to accept the PR and think about the hash functions later in a separate issue.
Would this proposed fix be slower on 64-bit machines at all? Not sure what the difference is with Float8GetDatum() on 64-bit.
I'm just wondering if we should do something like
#if IS64BIT /* theoretical - it's more complicated than this, just illustrating */
... current code ...
#else
... new version of code with Float8GetDatum()
#endif
The proposed fix will not be slower on 64 bit platforms. I guess it may even be slightly faster, but insignificantly; in general there is no difference on 64 bit. My patch helps to fix the issue on 32 bit platforms. Float8GetDatum uses palloc to pack float8 into Datum. I'm not sure the memory is freed before the end of the transaction, which may lead to huge memory consumption when a huge number of hashes are calculated.
I'm not sure we should use ifdef and create different hash calculation logic. I would like to have the same calculation logic on all platforms. Furthermore, the hash calculation function takes a float8 and returns a uint32, and the sizes of those types are the same on both 32 and 64 bit platforms.
My proposed solution does not compile on version 12 or earlier. I think that the original solution proposed by @df7cb, with some modifications (NaN and +-0 processing), would be the better alternative. It doesn't require any external headers.
@vitcpp wrote:
Float8GetDatum uses palloc to pack float8 into Datum.
Just to clarify, does Float8GetDatum() call palloc() on 64-bit or only on 32-bit?
Float8GetDatum calls palloc on 32 bit platforms because sizeof(Datum) = 4 is not enough to store the 8 bytes of a float8. The macro USE_FLOAT8_BYVAL determines which version is used. On 32 bit builds it is not defined.
If USE_FLOAT8_BYVAL is not defined then the following version is used:
Datum
Float8GetDatum(float8 X)
{
    float8 *retval = (float8 *) palloc(sizeof(float8));
    *retval = X;
    return PointerGetDatum(retval);
}
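For 32 bits we could use an #ifdef to emulate palloc as a temporary solution. But I do not like temporary solutions.
uint32 pgs_hashfloat8(double key)
{
#ifndef USE_FLOAT8_BYVAL
    /* float8 is passed by reference: point the Datum at the local copy, no palloc */
    Datum datum = PointerGetDatum(&key);
#else
    Datum datum = Float8GetDatum(key);
#endif
    return DatumGetUInt32(DirectFunctionCall1(hashfloat8, datum));
}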
It would avoid the palloc call, true. TBH, the current implementation is fast on 64-bit platforms, and anyone running database servers on 32-bit today should already be aware that there are limitations. Not sure we have to optimize for that.
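To make the trade-off in this exchange concrete, here is a small stand-alone model of the two ways to hand a float8 to a by-reference callee. It is a sketch under stated assumptions and not PostgreSQL code: `ModelDatum` stands in for `Datum`, `malloc` stands in for `palloc`, and every function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t ModelDatum;   /* stand-in for Datum */

static int alloc_count = 0;     /* heap allocations made by the model */

/* Models the by-reference Float8GetDatum quoted above (malloc for palloc). */
static ModelDatum model_wrap_alloc(double x)
{
    double *p = malloc(sizeof *p);
    if (p == NULL)
        abort();
    alloc_count++;
    *p = x;
    return (ModelDatum) p;
}

/* The callee reads the value through the pointer stored in the datum. */
static double model_callee(ModelDatum d)
{
    return *(const double *) d;
}

static double call_with_alloc(double key)
{
    ModelDatum d = model_wrap_alloc(key);
    double seen = model_callee(d);
    /* The model frees at once; palloc'd memory instead lives until it is
     * pfree'd or its memory context is reset or deleted. */
    free((void *) d);
    return seen;
}

static double call_with_local(double key)
{
    /* key is a local copy that outlives the call, so its address is valid */
    return model_callee((ModelDatum) &key);
}

static void demo_wrapping(void)
{
    assert(call_with_alloc(0.75) == 0.75);
    assert(alloc_count == 1);
    assert(call_with_local(0.75) == 0.75);
    assert(alloc_count == 1);   /* the second call allocated nothing */
}
```

Both calls deliver the same value to the callee; the only difference the model shows is the allocation per call, which is the cost being weighed against keeping a single code path.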
Dear All, I'm going to increment the patch number and create a new release artifact. Please let me know if you have any objections.
Sorry, it seems I hadn't merged the change before closing. Fixing it now.
Sorry I forgot to test on 32-bit... The new spherepoint_hash32 function crashes on platforms where float8 is passed by pointer.
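A stand-alone illustration of why the unwrapped call misbehaves may help here. Datum is an unsigned integer type, so passing a double where a Datum is expected converts the value to an integer instead of carrying its bits. The sketch below models this with `uintptr_t`; `ModelDatum` and all function names are hypothetical, and the by-value wrapper assumes a 64-bit build.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uintptr_t ModelDatum;   /* stand-in for Datum */

/* What passing a double directly amounts to: a numeric conversion
 * that drops the fractional part. */
static ModelDatum model_unwrapped(double x)
{
    return (ModelDatum) x;
}

/* By-value wrapping: copy the 8 bytes of the double into the integer.
 * Only possible when the datum is at least 64 bits wide. */
static ModelDatum model_float8_get_datum(double x)
{
    uint64_t bits;

    assert(sizeof(ModelDatum) >= sizeof(double));
    memcpy(&bits, &x, sizeof bits);
    return (ModelDatum) bits;
}

static double model_datum_get_float8(ModelDatum d)
{
    uint64_t bits = (uint64_t) d;
    double x;

    memcpy(&x, &bits, sizeof x);
    return x;
}

static void demo_truncation(void)
{
    /* 0.75 becomes the integer 0: the original value is lost */
    assert(model_unwrapped(0.75) == 0);
    /* wrapping and unwrapping preserves the value exactly */
    assert(model_datum_get_float8(model_float8_get_datum(0.75)) == 0.75);
}
```

On a build where float8 is passed by reference, the callee would treat that small integer as a pointer and dereference it, which would be consistent with the crash described above.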