
Looking for guidance on whether I should use an integer key directly as its own hash value, or transform it into a different value, I came across the following advice on this SO question:

Advice from the answers suggests using a multiplier to transform the integral input value:

In general, you should pick a multiplier that is in the order of your hash size and has no common factors with it. This way the hash function covers all your hash space uniformly.

and

I found the following algorithm provides a very good statistical distribution. Each input bit affects each output bit with about 50% probability.

However, I then came across another SO question.

The answer refers to the GCC implementation, stating that the default std::hash for integers simply returns the bit pattern:

As for the implementation in GCC, the specialization for builtin-types just returns the bit pattern. Here's how they are defined in bits/functional_hash.h:

size_t
operator()(_Tp __val) const noexcept
{ return static_cast<size_t>(__val); }

I checked my version (gcc-5.4.1) and can confirm this is the case.

So my question is:

For an integral key of type std::int64_t, should I use the implementation provided by the standard library, or an alternative such as the function suggested by Thomas Mueller in the first link above?

uint64_t hash(uint64_t x) {
    x = (x ^ (x >> 30)) * UINT64_C(0xbf58476d1ce4e5b9);
    x = (x ^ (x >> 27)) * UINT64_C(0x94d049bb133111eb);
    x = x ^ (x >> 31);
    return x;
}
Steve Lorimer
    Try both, measure, and pick the one that performs the best. – NathanOliver Jul 13 '17 at 15:28
  • check out an answer of mine here: https://stackoverflow.com/questions/44668672/hash-function-to-size-of-buckets-for-unordered-containers. If I'm not mistaken, gcc's unordered_map uses a prime table size, so a not-too-good hash function could be adequate. Even the identity relation could be fine. – geza Jul 13 '17 at 15:47
  • But of course, you should not rely on this information too much. A different version/compiler could use a different method (that's why I created my own hashtable implementation: I can rely on it) – geza Jul 13 '17 at 15:57
  • @geza absolutely, thanks, but it is still interesting to understand the mechanics of how the hash function plays with the bucket size – Steve Lorimer Jul 13 '17 at 15:58
  • One thing you should take into consideration is whether your `unordered_map` needs to withstand hash collision flood attacks – Passer By Jul 13 '17 at 16:03

0 Answers