spherepoint_hash32: float8 needs wrapping into a Datum #107

Merged: 1 commit, Nov 22, 2023
4 changes: 2 additions & 2 deletions src/point.c
@@ -315,8 +315,8 @@ Datum
 spherepoint_hash32(PG_FUNCTION_ARGS)
 {
 	SPoint *p1 = (SPoint *) PG_GETARG_POINTER(0);
-	Datum h1 = DirectFunctionCall1(hashfloat8, p1->lat);
-	Datum h2 = DirectFunctionCall1(hashfloat8, p1->lng);
+	Datum h1 = DirectFunctionCall1(hashfloat8, Float8GetDatum(p1->lat));
Contributor

I think the code will work, but it will significantly decrease performance on 32-bit platforms. It is OK to fix the failures on 32-bit this way, but the function should be improved. We should create a new issue for this problem.

Contributor Author

That's how 64-bit floats are supposed to be handled on 32-bit platforms. What would you want to change there?
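
To illustrate why the wrapper matters at all, here is a minimal standalone sketch. It assumes that a datum is an unsigned integer wide enough to hold a pointer, and it uses a hypothetical `DemoDatum` typedef plus `demo_*` helpers that are not part of PostgreSQL or pgsphere. When a `double` is passed where an integer-typed datum is expected, C converts the numeric value and drops the fraction; a by-value wrapper instead copies the eight bytes unchanged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a datum on a 64-bit, pass-by-value build (assumption). */
typedef uint64_t DemoDatum;

/* Passing a double where an integer datum is expected: C converts the
 * numeric value to an integer, so the fractional part is lost. */
static DemoDatum demo_implicit(double x)
{
    assert(x >= 0.0);   /* converting a negative double to unsigned is undefined */
    DemoDatum d = x;    /* implicit double-to-integer conversion */
    return d;
}

/* By-value wrapping: copy the 8 bytes of the double unchanged. */
static DemoDatum demo_wrap(double x)
{
    DemoDatum d;
    memcpy(&d, &x, sizeof d);
    return d;
}

/* Inverse of demo_wrap: reinterpret the stored bytes as a double. */
static double demo_unwrap(DemoDatum d)
{
    double x;
    memcpy(&x, &d, sizeof x);
    return x;
}
```

In this sketch, `demo_implicit(0.25)` and `demo_implicit(0.75)` both yield 0, while `demo_wrap` keeps the two values distinct and `demo_unwrap` recovers them exactly.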

Contributor

@vitcpp vitcpp Nov 15, 2023

Maybe something like this? I haven't compiled it yet.

uint32 pgs_hashfloat8(double key)
{
	/*
	 * On IEEE-float machines, minus zero and zero have different bit patterns
	 * but should compare as equal.  We must ensure that they have the same
	 * hash value, which is most reliably done this way:
	 */
	if (key == (float8) 0)
		return 0;	/* plain uint32 function, so no PG_RETURN_UINT32 */

	/*
	 * Similarly, NaNs can have different bit patterns but they should all
	 * compare as equal.  For backwards-compatibility reasons we force them to
	 * have the hash value of a standard NaN.
	 */
	if (isnan(key))
		key = get_float8_nan();

	return hash_bytes((unsigned char *) &key, sizeof(key));
}

Datum spherepoint_hash32(PG_FUNCTION_ARGS)
{
	SPoint	   *p1 = (SPoint *) PG_GETARG_POINTER(0);
	const uint32 h1 = pgs_hashfloat8(p1->lng);
	const uint32 h2 = pgs_hashfloat8(p1->lat);
	...
}
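
The normalization steps in the proposal above can be tried outside PostgreSQL. The following is a sketch under stated assumptions: `demo_hash_bytes` is a stand-in FNV-1a hash, not PostgreSQL's `hash_bytes`, and `demo_hash_float8` is a hypothetical name. It only demonstrates that +0.0 and -0.0, and all NaN bit patterns, end up with the same hash value.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in 32-bit byte hash (FNV-1a); NOT PostgreSQL's hash_bytes(). */
static uint32_t demo_hash_bytes(const unsigned char *buf, size_t len)
{
    uint32_t h = 2166136261u;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 16777619u;
    }
    return h;
}

/* Hypothetical helper: hash a double directly, without a datum wrapper. */
static uint32_t demo_hash_float8(double key)
{
    unsigned char raw[sizeof key];

    /* +0.0 and -0.0 compare equal, so both must hash to the same value. */
    if (key == 0.0)
        return 0;

    /* Collapse every NaN payload onto one canonical NaN bit pattern. */
    if (isnan(key))
        key = (double) NAN;

    memcpy(raw, &key, sizeof key);
    return demo_hash_bytes(raw, sizeof raw);
}
```

Because the helper takes a `double` and returns a 32-bit integer, nothing in it depends on the width of a pointer, so no allocation is needed on either 32-bit or 64-bit builds.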

Contributor

@vitcpp vitcpp Nov 15, 2023

It seems to work on 64-bit platforms. I may create a PR.
(force-pushed) 6f9a86e

P.S. Compilation fails on PostgreSQL versions 10-12.

Contributor

Well, it seems my proposed solution doesn't work on PG 10-12 because compilation fails. It seems the hash functions are not declared in the headers there, or the headers are different. It is unfortunate that the hash functions are implemented as "pg-functions" (they accept and return Datum) rather than as plain functions. The hash operation may be called frequently, so calling palloc to wrap a float8 for every hash calculation is not a good approach on 32-bit platforms, I believe. Anyway, I propose to accept the PR and revisit the hash functions later in a separate issue.

Contributor

Would this proposed fix be slower on 64-bit machines at all? Not sure what the difference is with Float8GetDatum() on 64-bit.

I'm just wondering if we should do something like

#if IS64BIT /* theoretical - it's more complicated than this, just illustrating */
    ... current code ...
#else
    ... new version of code with Float8GetDatum()
#endif

Contributor

The proposed fix will not be slower on 64-bit platforms. I guess it may be slightly faster, but insignificantly; in general there is no difference on 64-bit. My patch helps to fix the issue on 32-bit platforms. Float8GetDatum uses palloc to pack a float8 into a Datum. I'm not sure whether that memory is freed before the end of the transaction; if it is not, a huge number of hash calculations may lead to huge memory consumption.

I'm not sure we should use ifdef and create different hash calculation logic. I would like to have the same calculation logic on all platforms. Furthermore, the hash calculation function takes a float8 and returns a uint32, types whose sizes are the same on both 32-bit and 64-bit platforms.

My proposed solution does not compile on version 12 or earlier. I think that the original solution proposed by @df7cb, with some modifications (NaN and ±0 handling), would be the better alternative. It doesn't require any external headers.

Contributor

@esabol esabol Nov 17, 2023

@vitcpp wrote:

Float8GetDatum uses palloc to pack float8 into Datum.

Just to clarify, does Float8GetDatum() call palloc() on 64-bit or only on 32-bit?

Contributor

@vitcpp vitcpp Nov 17, 2023

Float8GetDatum calls palloc on 32-bit platforms because sizeof(Datum) = 4 is not enough to store the 8 bytes of a float8. The macro USE_FLOAT8_BYVAL determines which version is used; on 32-bit builds it is not defined.

If USE_FLOAT8_BYVAL is not defined then the following version is used:

Datum
Float8GetDatum(float8 X)
{
	float8	   *retval = (float8 *) palloc(sizeof(float8));

	*retval = X;
	return PointerGetDatum(retval);
}
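
The by-reference variant above can be imitated in a standalone sketch. The assumptions are loud ones: `DemoDatum` is a stand-in typedef for a pointer-sized datum, `malloc` stands in for `palloc`, and all `demo_*` names are hypothetical. The sketch shows where the per-call allocation comes from, and one possible allocation-free alternative that borrows the address of a caller-owned `double`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in: a pointer-sized unsigned integer (assumption). */
typedef uintptr_t DemoDatum;

/* By-reference packing: one heap copy per call; malloc() stands in
 * for palloc(), and the caller is responsible for freeing the copy. */
static DemoDatum demo_float8_by_ref(double x)
{
    double *copy = malloc(sizeof *copy);

    assert(copy != NULL);
    *copy = x;
    return (DemoDatum) copy;
}

/* Read the double back through the pointer stored in the datum. */
static double demo_datum_get_float8(DemoDatum d)
{
    return *(const double *) d;
}

/* Allocation-free alternative: the datum is valid only while *slot lives. */
static DemoDatum demo_float8_borrowed(const double *slot)
{
    return (DemoDatum) slot;
}
```

The borrowed form is safe only when the callee finishes with the datum before the caller's variable goes out of scope, which is the case for a synchronous hash call.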

Contributor

For 32-bit we may use an ifdef to emulate palloc as a temporary solution. But I do not like temporary solutions.

uint32 pgs_hashfloat8(double key)
{
#ifndef USE_FLOAT8_BYVAL	/* float8 is pass-by-reference (32-bit builds) */
    Datum datum = PointerGetDatum(&key);
    uint32 hash = DatumGetUInt32(DirectFunctionCall1(hashfloat8, datum));
#endif

+	Datum h2 = DirectFunctionCall1(hashfloat8, Float8GetDatum(p1->lng));
 
 	PG_RETURN_INT32(DatumGetInt32(h1) ^ DatumGetInt32(h2));
 }