Experimenting with different hash functions #992

prvyk · 2025-02-02T19:28:17Z

I'm not sure what the motivation behind the current hash function is. It isn't one I recognize (granted, my CS education has plenty of gaps), and it seems a bit odd: it appears to look at only half the array yet mixes in the length, and I don't see why 40343 was chosen over another prime. Maybe it's a cache-locality thing? git blame shows it has been unchanged since the initial FASTER commit, and I couldn't find any documentation about it in the research PDFs. It reminds me of FNV, but it isn't FNV.

I've implemented FNV for comparison. Running Resp.benchmark locally suggests that FNV improves performance**. Maybe there's a better benchmark, or some other consideration behind the choice of this hash? [Edit: A correctly working FNV turned out to be slower than the current function, so I'm trying a different variant that only adds unchecked.]

The goal of this PR is not to merge code but to ask a question while having code available for testing and benchmarking. Even assuming a different hash function would be better, there would still be work to do in testing all the options.

** The FNV implementation uses unchecked. Merely adding unchecked to the existing function also seems to improve performance a bit, but not by as much as FNV in local testing.
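For readers unfamiliar with FNV, here is a minimal sketch of the 32-bit FNV-1a variant in Python. It is not the code from this PR and not Garnet's current hash; the function name `fnv1a_32` is chosen here for illustration. In C#, `unchecked` lets integer arithmetic wrap on overflow; Python integers never overflow, so the sketch masks to 32 bits to get the same wraparound.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """Return the 32-bit FNV-1a hash of data (illustrative sketch)."""
    h = FNV32_OFFSET_BASIS
    for b in data:
        h ^= b                               # FNV-1a: xor the byte in first...
        h = (h * FNV32_PRIME) & 0xFFFFFFFF   # ...then multiply, wrapping to 32 bits
    return h


if __name__ == "__main__":
    for key in (b"", b"a", b"key:1"):
        print(key, hex(fnv1a_32(key)))
```

FNV processes one byte per iteration, so a per-byte loop like this is one plausible reason a straightforward FNV could lose to a function that consumes wider words, though that is a guess and not something measured here.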


badrishc · 2025-02-02T22:20:50Z

Speed is important, but more important is how evenly the hash function distributes keys across buckets. The current hash function in Garnet is derived from research done in my group a long time ago, and it showed a pretty even distribution at high speed on, e.g., YCSB workloads. To see this, run FasterYcsbBenchmark in the FASTER repo and invoke the DumpDistribution method.

We're happy to consider alternatives if there are better options out there, judged on both speed and evenness of spread.
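As a rough illustration of the "even spread" metric, the sketch below counts how many buckets receive 0, 1, 2, ... keys under a given hash function. It is a stand-alone approximation and not FASTER's DumpDistribution; the synthetic 8-byte sequential keys, the table size of 2^16, and the deliberately weak `weak_hash` baseline are all assumptions made for the example.

```python
from collections import Counter


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a, used here only as one candidate hash."""
    h = 0x811C9DC5
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


def weak_hash(data: bytes) -> int:
    """Deliberately poor baseline: the sum of the key's bytes."""
    return sum(data)


def bucket_occupancy(keys, hash_fn, num_buckets):
    """Map 'keys in a bucket' -> 'number of buckets with that many keys'."""
    per_bucket = Counter(hash_fn(k) % num_buckets for k in keys)
    occupancy = Counter(per_bucket.values())
    occupancy[0] = num_buckets - len(per_bucket)  # buckets that received no key
    return occupancy


# Synthetic keys (an assumption): sequential 8-byte little-endian integers.
keys = [i.to_bytes(8, "little") for i in range(100_000)]

if __name__ == "__main__":
    for name, fn in (("fnv1a_32", fnv1a_32), ("byte sum", weak_hash)):
        occ = bucket_occupancy(keys, fn, 1 << 16)
        print(f"{name}: empty buckets = {occ[0]}, max load = {max(occ)}")
```

A well-spread hash should leave relatively few buckets empty and keep the maximum load small; the byte-sum baseline collapses most keys into a few hundred buckets, which is the kind of skew such a histogram makes visible.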

Add fnv hash for benchmarking comparison

c7cfa23

prvyk marked this pull request as draft February 2, 2025 19:28

prvyk added 3 commits February 2, 2025 21:51

Another attempt, this time just with adding unchecked.

48a4980

Implement FNV-32 right just for testing - this seems slower than the current function.

1995394

CityHash64

5154f03

Sometimes faster, sometimes slower, overall a tiny bit slower compared to unchecked function.

prvyk changed the title ~~[DRAFT] fnv hash for benchmarking comparison~~ Experimenting with different hash functions Feb 2, 2025

dotnet format

f1b6d52

prvyk added 2 commits February 3, 2025 00:33

Forgot the >64 case.

f7d5fac

dotnet format

41fd8ea
