Update (August 2021): My benchmark predates
WebAssembly and is therefore out-of-date. There likely exist hash functions
compiled into WebAssembly that beat the pure JS implementations. If somebody
wants to update this benchmark, a pull request to my benchmark repo would be most welcome!
I benchmarked various hashing algorithms, and here's the fastest options I found:
If you only need 32 bit digests, use iMurmurHash. Note that this will give you collisions after about 2**14 (16,000) hashes.
Use SparkMD5 if you need more than 32 bits. I didn't find a fast 64 or 128 bit Murmur implementation, but SparkMD5 was surprisingly fast (75 MB/sec).
- If you need incremental updates, consider joining strings into larger chunks before feeding them to SparkMD5, as SparkMD5's incremental hashing seems to suffer from some moderate overhead.
These recommendations are for pure JavaScript. I benchmarked them with strings, but SparkMD5 takes ArrayBuffers as well.
If you want fast hashing modules on Node, your best options are slightly different:
If you are hashing Buffers: Use the built-in crypto module with the MD5 algorithm.
The exception to this is: If you don't need incremental hashing, and you need more than 500 MB/sec throughput, and you're OK with a native npm dependency, use murmurhash-native for some extra performance. I didn't test digest sizes of less than 128 bit, as even with 128 bits the hashing is so fast that it's not likely to be a bottleneck.
(Note that murmurhash-native technically supports incremental hashing, but it's slower than Node's built-in MD5 algorithm in this mode.)
If you are hashing a single string non-incrementally, convert
it to a Buffer and see the preceding bullet point.
If you are hashing strings incrementally:
If you only need 32 bits, use iMurmurHash. Note that this will give you collisions after about 2**14 (16,000) hashes.
If you need more than 32 bits: Use the built-in crypto module with the MD5 algorithm.
- I also recommend that you experiment with joining strings together into larger chunks, since strings are implicitly converted to Buffers when you pass them to the crypto module, and each Buffer creation is quite slow. Performance will generally be bottlenecked by Buffer creation and string joining, as the native hashing function is very fast in comparison.