Rust's default HashMap hasher is SipHash, which is considerably slower than simpler hashers in some cases (for integer keys, for example), but it provides HashDoS protection. As described here, NoHashHasher can be around 50 times faster than the default hasher and 10-20 times faster than the FNV and FX hashers (not always, of course, but in some cases, like the one described there).

In my personal experience, I very rarely need the extra protection against HashDoS attacks. For example, if I use database IDs as keys and those IDs are autoincrement-based, I really don't see much of a risk there. Nonetheless, I understand there is a good reason to have a safer hasher as the default (better safe than sorry).

So when we know we are not exposed to any risk of HashDoS attacks and need to use 64-bit or smaller integers as keys, are there any reasons to refrain from using NoHashHasher? Obviously, performance will not always be better, and in certain cases it might even be worse than with the FNV/FX hashers. I am specifically asking about non-obvious risks, downsides, or special rules to follow in situations like the one described in the linked question, where NoHashHasher clearly shows excellent performance.
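
For reference, here is a minimal sketch of what an identity ("no-hash") hasher for integer keys might look like using only the standard library. `IdentityHasher`, `IdentityMap` and `build_id_map` are hypothetical names made up for this illustration; this is not the actual `NoHashHasher` implementation, just the general idea of passing the integer key through unchanged as its hash.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical identity hasher (illustration only): the "hash" of an
/// integer key is the key itself.
#[derive(Default)]
struct IdentityHasher(u64);

impl Hasher for IdentityHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    // Anything that is not a single integer write (strings, tuples, ...)
    // ends up here, and this sketch deliberately refuses to handle it.
    fn write(&mut self, _bytes: &[u8]) {
        panic!("IdentityHasher only supports integer keys of 64 bits or fewer");
    }

    // The default integer methods forward to `write`, so they are
    // overridden to store the value directly.
    fn write_u8(&mut self, n: u8) { self.0 = u64::from(n); }
    fn write_u16(&mut self, n: u16) { self.0 = u64::from(n); }
    fn write_u32(&mut self, n: u32) { self.0 = u64::from(n); }
    fn write_u64(&mut self, n: u64) { self.0 = n; }
    fn write_usize(&mut self, n: usize) { self.0 = n as u64; }
}

/// A HashMap that uses the identity hasher instead of SipHash.
type IdentityMap<K, V> = HashMap<K, V, BuildHasherDefault<IdentityHasher>>;

/// Builds a small map keyed by made-up autoincrement-style IDs.
fn build_id_map() -> IdentityMap<u64, &'static str> {
    let mut map = IdentityMap::default();
    map.insert(1, "alice");
    map.insert(2, "bob");
    map
}

fn main() {
    let map = build_id_map();
    println!("{:?}", map.get(&2)); // prints Some("bob")
}
```

With a hasher like this, the quality of the distribution depends entirely on the keys themselves, which is exactly the kind of property I am asking about.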

at54321
  • "if I put database IDs as keys and those IDs were autoincrement-based, I really don't see much of a risk there" Well, of course it should be secured even if someone "guesses" an ID, but this allows brute force or scraping more easily. – Stargateur Jan 02 '22 at 10:39
  • Note that you're conflating two issues here: HashDoS resistance (which is one of the features of keyed hashes) and *collision avoidance*: it's highly desirable to avoid collisions between different entries. Hash functions "randomise" the entries (at least assuming the function is good), which limits collisions, and thus avoids the costs of collision resolution. – Masklinn Jan 02 '22 at 13:13
  • That said, using identity hashing for integers is relatively common as this can offer interesting clustering (and thus performance) properties. Python, Java, and C# use an "identity" hash function for integers, for instance. Ruby, on the other hand, does not. – Masklinn Jan 02 '22 at 13:17
  • @Stargateur This is a really long topic and many cases can be discussed. I believe in _most_ cases autoincrement IDs are totally fine, security-wise. So I think that's a pretty reasonable use case from a practical point of view. – at54321 Jan 02 '22 at 14:12
  • @Masklinn True, but also true is that this risk shouldn't be exaggerated. A huge portion of code written today uses identity hash functions and in the vast majority of cases they work very well. In the past, out of curiosity, I tested various huge real-world data sets for collisions, etc. and in all of them the distribution with identity hash functions was excellent. That, of course, doesn't mean there aren't cases that can result in collisions, but my point is that's rather rare. – at54321 Jan 02 '22 at 14:33
  • If you don't need the protection, and `NoHashHasher` shows good perf, what other reason do you have not to use it? – Chayim Friedman Jan 03 '22 at 08:07