spherepoint_hash32: float8 needs wrapping into a Datum #107

df7cb · 2023-11-15T10:29:56Z

Sorry I forgot to test on 32-bit... The new spherepoint_hash32 function crashes on platforms where float8 is passed by pointer.

vitcpp · 2023-11-15T11:06:16Z

Well, zero patch number seems to be unlucky :) Having a 32 bit platform in the test pipeline might help to catch such problems. But I haven't found such platforms on GitHub Actions unfortunately.

df7cb · 2023-11-15T11:26:48Z

The embarrassing part is that I do have such a test pipeline on apt.postgresql.org, but it wasn't running for pgsphere yet:

https://pgdgbuild.dus.dg-i.net/view/Snapshot/job/pgsphere-binaries-snapshot/

vitcpp · 2023-11-15T12:57:10Z

src/point.c

@@ -315,8 +315,8 @@ Datum
 spherepoint_hash32(PG_FUNCTION_ARGS)
 {
 	SPoint	   *p1 = (SPoint *) PG_GETARG_POINTER(0);
-	Datum		h1 = DirectFunctionCall1(hashfloat8, p1->lat);
-	Datum		h2 = DirectFunctionCall1(hashfloat8, p1->lng);
+	Datum		h1 = DirectFunctionCall1(hashfloat8, Float8GetDatum(p1->lat));


I think the code will work, but I suspect it will significantly decrease performance on 32-bit platforms. It is OK to fix the failures on 32-bit, but the function should be improved. We should create a new issue for this problem.

That's how 64-bit floats are supposed to be handled on 32-bit platforms, what would you want to change there?
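To make the wrapping step concrete, here is a minimal, self-contained model that does not use the PostgreSQL headers. `Datum64`, `demo_pack_byval`, `demo_unpack_byval` and `demo_unwrapped` are hypothetical stand-ins, and a fixed 64-bit integer models a pass-by-value Datum. The sketch assumes that a double handed over without wrapping is converted the way ordinary C converts a double to an integer type, which is a numeric conversion rather than a bit copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 64-bit, pass-by-value Datum (not PostgreSQL's type). */
typedef uint64_t Datum64;

/* Wrap: copy the bit pattern of the double into the integer. */
Datum64
demo_pack_byval(double x)
{
    Datum64 d;

    assert(sizeof(d) == sizeof(x));
    memcpy(&d, &x, sizeof(d));
    return d;
}

/* Unwrap: copy the bit pattern back out. */
double
demo_unpack_byval(Datum64 d)
{
    double x;

    memcpy(&x, &d, sizeof(x));
    return x;
}

/*
 * No wrapping: C converts the value numerically and drops the fraction.
 * Only defined for non-negative values that fit into the integer type.
 */
Datum64
demo_unwrapped(double x)
{
    return (Datum64) x;         /* 1.75 becomes the integer 1 */
}
```

In this model the unwrapped value does not survive the round trip. On a build where float8 is passed by pointer, the same integer would instead be dereferenced as an address, which is consistent with the crash reported at the top of this thread.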

Maybe something like this? I haven't compiled it yet.

    uint32
    pgs_hashfloat8(double key)
    {
        /*
         * On IEEE-float machines, minus zero and zero have different bit patterns
         * but should compare as equal. We must ensure that they have the same
         * hash value, which is most reliably done this way:
         */
        if (key == (float8) 0)
            PG_RETURN_UINT32(0);

        /*
         * Similarly, NaNs can have different bit patterns but they should all
         * compare as equal. For backwards-compatibility reasons we force them to
         * have the hash value of a standard NaN.
         */
        if (isnan(key))
            key = get_float8_nan();

        return hash_bytes((unsigned char *) &key, sizeof(key));
    }

    void
    spherepoint_hash32()
    {
        SPoint *p1 = (SPoint *) PG_GETARG_POINTER(0);
        const uint32 h1 = pgs_hashfloat8(p1->lng);
        const uint32 h2 = pgs_hashfloat8(p1->lat);
        ...
    }

It seems to work on 64 bit platforms. I may create a PR.
(force-pushed) 6f9a86e

P.S. Compilation fails on versions 10-12.

Well, it seems my proposed solution doesn't work on PG 10-12 because the compilation fails. It seems the hash functions are not declared in the headers, or the headers are different. It is sad that the hash functions are implemented as "pg-functions" rather than as simple functions (they accept and return Datum). The hash operation may be called frequently, so calling palloc to wrap a float8 for hash calculation is not a good approach on 32-bit platforms, I believe. Anyway, I propose to accept the PR and think about the hash functions later in a separate issue.

Would this proposed fix be slower on 64-bit machines at all? Not sure what the difference is with Float8GetDatum() on 64-bit.

I'm just wondering if we should do something like

    #if IS64BIT /* theoretical - it's more complicated than this, just illustrating */
        ... current code ...
    #else
        ... new version of code with Float8GetDatum()
    #endif
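A rough sketch of what such a compile-time switch could look like follows. PostgreSQL has its own macro for this, `USE_FLOAT8_BYVAL`. The `DEMO_FLOAT8_BYVAL` macro and `demo_float8_is_byval` function below are hypothetical names that only approximate it from the pointer width, so that the snippet compiles without any PostgreSQL headers.

```c
#include <stdint.h>

/* Hypothetical approximation: treat float8 as by-value when pointers are 8 bytes wide. */
#if UINTPTR_MAX >= UINT64_MAX
#define DEMO_FLOAT8_BYVAL 1
#endif

int
demo_float8_is_byval(void)
{
#ifdef DEMO_FLOAT8_BYVAL
    return 1;   /* 64-bit style: the double fits in the Datum itself */
#else
    return 0;   /* 32-bit style: the Datum holds a pointer to the double */
#endif
}
```

A real extension would presumably test the macro from the PostgreSQL headers rather than guess from the pointer width.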

The proposed fix will not be slower on 64-bit platforms. I guess it may be slightly faster, but insignificantly; in general there is no difference on 64-bit platforms. My patch helps to fix the issue on 32-bit platforms. Float8GetDatum uses palloc to pack float8 into Datum. I'm not sure the memory is freed before the end of the transaction, which may lead to huge memory consumption when a huge number of hashes are calculated.

I'm not sure we should use ifdef and create different hash calculation logic. I would like to have the same calculation logic on all platforms. Furthermore, the hash calculation function takes a float8 and returns a uint32, types whose sizes are the same on both 32-bit and 64-bit platforms.

My proposed solution does not compile on version 12 or earlier. I think that the original solution proposed by @df7cb, with some modifications (NaN and ±0 processing), would be the better alternative. It doesn't require any external headers.
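For illustration, the NaN and ±0 handling can be sketched without any PostgreSQL headers. This is only a model of the idea: `demo_hash_bytes` is a plain 32-bit FNV-1a hash swapped in for PostgreSQL's `hash_bytes`, so the hash values will not match what PostgreSQL produces, and both function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in byte hash (32-bit FNV-1a); NOT PostgreSQL's hash_bytes(). */
uint32_t
demo_hash_bytes(const unsigned char *p, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash a double so that values comparing equal also hash equally. */
uint32_t
demo_hashfloat8(double key)
{
    unsigned char buf[sizeof(double)];

    assert(sizeof(double) == 8);

    /* +0.0 and -0.0 compare equal but differ in their sign bit. */
    if (key == 0.0)
        return 0;

    /* Collapse every NaN bit pattern to one representative NaN. */
    if (isnan(key))
        key = (double) NAN;

    memcpy(buf, &key, sizeof(buf));
    return demo_hash_bytes(buf, sizeof(buf));
}
```

The function takes a plain double and returns a plain 32-bit integer, so nothing in it depends on how a Datum is represented on the platform.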

@vitcpp wrote:

Float8GetDatum uses palloc to pack float8 into Datum.

Just to clarify, does Float8GetDatum() call palloc() on 64-bit or only on 32-bit?

Float8GetDatum calls palloc on 32-bit platforms because sizeof(Datum) = 4 is not enough to store the 8 bytes of a float8. The macro USE_FLOAT8_BYVAL selects which version is used. On 32-bit builds it is not defined.

If USE_FLOAT8_BYVAL is not defined then the following version is used:

    Datum
    Float8GetDatum(float8 X)
    {
        float8 *retval = (float8 *) palloc(sizeof(float8));

        *retval = X;
        return PointerGetDatum(retval);
    }

For 32 bits we could, under an #ifdef, emulate the palloc by passing a pointer to a local variable as a temporary solution. But I do not like temporary solutions.

    uint32
    pgs_hashfloat8(double key)
    {
    #ifdef 32BIT
        Datum datum = &key;
        uint32 hash = DirectFunctionCall1(hashfloat8, datum);
    #endif
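The idea in the fragment above, handing the callee a pointer to a local variable instead of a freshly allocated copy, can be illustrated outside PostgreSQL. Everything in this sketch is a hypothetical stand-in: `DemoDatum` models a pointer-sized Datum, `demo_alloc_float8` models the allocating `Float8GetDatum` with `malloc` in place of `palloc`, and `demo_read_byref` plays the role of a by-reference callee such as `hashfloat8`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pointer-sized Datum (not PostgreSQL's type). */
typedef uintptr_t DemoDatum;

/* Counts heap copies made by the allocating wrapper. */
int demo_alloc_count = 0;

/* Allocating wrapper: every call creates a new heap copy of the value. */
DemoDatum
demo_alloc_float8(double x)
{
    double *copy = malloc(sizeof(double));

    if (copy == NULL)
        abort();
    *copy = x;
    demo_alloc_count++;
    return (DemoDatum) copy;
}

/* By-reference callee: the Datum is an address that must be dereferenced. */
double
demo_read_byref(DemoDatum d)
{
    return *(const double *) d;
}

/* Non-allocating caller: the local outlives the call, so its address can be passed. */
double
demo_call_with_local(double key)
{
    DemoDatum d = (DemoDatum) &key;

    return demo_read_byref(d);
}
```

The pointer is only valid while `demo_call_with_local` is on the stack, so this works only under the assumption that the callee reads the value and does not keep the Datum after returning.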

df7cb · 2023-11-15T14:10:31Z

It would avoid the palloc call, true.

TBH, the current implementation is fast on 64-bit platforms, and anyone running database servers on 32-bit today should already be aware that there are limitations. Not sure we have to optimize for that.

vitcpp · 2023-11-20T13:43:29Z

Dear all, I'm going to increment the patch number and create a new release artifact. Please let me know if you have any objections.

vitcpp · 2023-11-22T04:52:50Z

Sorry, it seems I didn't merge the change before closing. Fixing it now.

spherepoint_hash32: float8 needs wrapping into a Datum

7aab7a3

vitcpp approved these changes Nov 15, 2023

esabol approved these changes Nov 17, 2023

vitcpp closed this Nov 20, 2023

vitcpp reopened this Nov 22, 2023

vitcpp merged commit f229b2e into postgrespro:master Nov 22, 2023
28 checks passed
