Might also try running it through Smhasher3: https://gitlab.com/fwojcik/smhasher3/-/blob/main/results/REA...
Also here is a hash function I wrote a while ago now: https://github.com/Keith-Cancel/k-hashv/tree/main
While the benefit of processing chunks of 8 bytes is obvious, what's the purpose of grouping those into macrogroups of 4? Does it trigger any implicit parallelism I failed to spot? Or is it just to end this phase with the 4 h[] having had the same amount of entropy, and thus starting the next one with h[0]?
> The way it loads the 8 bytes is also important. The correct way is to load via shift+or > This is free of any UB, works on any alignment and on any machine regardless of it's endianness. It's also fast, gcc and clang recognize this pattern and optimize it into a single mov instruction on x86 targets.
Is a single MOV instruction still fast when the 8 bytes begin on an odd address?
XXH3 omitted from benchmark?
Someone just ported this implementation to Rust. :)
How does it compare to CRC64, for the purpose of detecting errors?
See also: “Meow hash” by the legendary Casey Muratori, a fast non-cryptographic hash function: https://github.com/cmuratori/meow_hash
> It’s the fastest hash function we know of, and we have benchmarked all the ones we could find. On modern Intel x64 CPUs, it hashes 16 bytes per cycle single-threaded. This means in cache it can hash at a rate of 64 gigabytes per second on a 4.2gHz machine. Out of cache, it hashes at whatever speed your main memory bus can provide to a single core, since that is usually the limiting factor on modern x64 CPUs.
> It has also now been tuned to be the fastest hash on small inputs, too. Despite the fact that it is a full 128-bit hash, it still outperforms “fast” 64-bit hashes across all input sizes.
— https://archive.is/CQOVm (originally https://mollyrocket.com/meowhash)
Discussion on HN: https://news.ycombinator.com/item?id=29038813 & https://news.ycombinator.com/item?id=18262627
[dead]
SMHasher/Murmurhash author here - I don't see anything fundamentally wrong with this hash function, it uses basically the same operations as the Murmur family (and a lot of other hashes at this point).
The handling of the "tail" of the key (the last < 32 bytes) is slightly odd (the "if (l & 1) { mix 1 byte of key }" happens before 8-byte chunks and 2-byte chunks), but if it passes SMHasher it should be fine for general use.