Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use may_alias with unaligned reads to fix miscompilation on GCC #1013

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

lassipulkkinen
Copy link

When XXH_FORCE_MEMORY_ACCESS==1, which is the default on supported compilers unlike stated in the README, a strict aliasing violation occurs in XXH_read64, resulting in miscompilation on GCC (but not Clang) in some oddly specific circumstances.

The following code reproduces the problem on x86_64 GCC 14.2.1 when compiled with -O3:

#define XXH_INLINE_ALL

#include <inttypes.h>
#include <stdio.h>
#include <xxhash.h>

int main() {
	// it seems this has to be exactly 24 bytes.
	union {
		char x[24];
		// force 8-byte alignment without making
		// aliasable with uint64_t.
		void *y[3];
	} data = {.x = "garblegarblegarblegarble"};
	uint64_t hash = XXH64(&data, sizeof(data), 0);
	printf("%016"PRIx64"\n", hash);
	return 0;
}

A bogus -Wuninitialized warning is produced if enabled, and the resulting program outputs an incorrect hash.

While this definitely looks like a compiler bug in some ways, I see no reason to assume aligned(1) alone should excempt the type from aliasing restrictions regardless, and adding may_alias does fix the problem.

The separate strict aliasing bug when XXH_FORCE_ALIGN_CHECK==1, discussed in #383, still remains.

When XXH_FORCE_MEMORY_ACCESS==1, which is the default on supported
compilers unlike stated in the README, a strict aliasing violation
occurs in XXH_read64, resulting in miscompilation on GCC (but not Clang)
in some oddly specific circumstances.

The following code reproduces the problem on x86_64 GCC 14.2.1 when
compiled with -O3:

	#define XXH_INLINE_ALL

	#include <inttypes.h>
	#include <stdio.h>
	#include <xxhash.h>

	int main() {
		// it seems this has to be exactly 24 bytes.
		union {
			char x[24];
			// force 8-byte alignment without making
			// aliasable with uint64_t.
			void *y[3];
		} data = {.x = "garblegarblegarblegarble"};
		uint64_t hash = XXH64(&data, sizeof(data), 0);
		printf("%016"PRIx64"\n", hash);
		return 0;
	}

A bogus -Wuninitialized warning is produced if enabled, and the
resulting program outputs an incorrect hash.

While this definitely looks like a compiler bug in some ways, I see no
reason to assume aligned(1) alone should excempt the type from aliasing
restrictions regardless, and adding may_alias does fix the problem.

The separate strict aliasing bug when XXH_FORCE_ALIGN_CHECK==1,
discussed in Cyan4973#383, still remains.
@Cyan4973 Cyan4973 self-assigned this Feb 14, 2025
@Cyan4973
Copy link
Owner

Cyan4973 commented Feb 20, 2025

XXH64() exposes a void* pointer as input argument,
so it should be enough to allow aliasing.

However, with inlining via XXH_INLINE_ALL, I guess the compiler can pass through this interface layer, and access directly the code below it, skipping the void* intermediate stage.
I don't know C well enough to tell if the compiler is allow to do that (i.e. ignore void* contract), or is well within its right.
Either way, I can see how this can be an immediate problem.

The proposed fix is simple enough, and it should do exactly as advertised, i.e. allow aliasing, even when inlining.

That being said, this issue has not been exposed in CI so far, showing that we probably need to amend the test suite to produce the problem, observe its fix, and ensure the property remains guaranteed over time.

Since you also provide a test case in the issue description, I'll try to use that as a basis for the new test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants