108

C++0x adds hash<...>(...).

I could not find a hash_combine function though, as presented in boost. What is the cleanest way to implement something like this? Perhaps, using C++0x xor_combine?

plasmacel
  • 8,183
  • 7
  • 53
  • 101
Neil G
  • 32,138
  • 39
  • 156
  • 257

8 Answers8

118

Well, just do it like the boost guys did it:

template <class T>
inline void hash_combine(std::size_t& seed, const T& v)
{
    std::hash<T> hasher;
    seed ^= hasher(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
}
Karl von Moor
  • 8,484
  • 4
  • 40
  • 52
  • 28
    yeah, that's the best I could do too. I don't understand how the standards committee declined something so obvious. – Neil G Apr 07 '10 at 23:00
  • 15
    @Neil: I agree. I think a simple solution for them would be the requirement of the library to have a hash for `std::pair` (or `tuple`, even). It would compute the hash of each element, then combine them. (And in the spirit of the standard library, in an implementation defined way.) – GManNickG Apr 08 '10 at 06:06
  • 3
    There are a lot of obvious things omitted from the standard. The process of intensive peer review makes it difficult to get those little things out the door. – stinky472 Jun 27 '10 at 08:38
  • 16
    Why these magic numbers here? And isn't the above machine-dependent (e.g. won't it be different on x86 and x64 platforms)? – einpoklum Jan 08 '14 at 16:49
  • 3
    There's a paper suggesting the inclusion of hash_combine [here](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3333.html) – SSJ_GZ Jan 12 '14 at 16:35
  • 2
    Is there something like this in std lib, or an coming version of the std lib? – allyourcode May 27 '14 at 22:25
  • 4
    I guess a good combining method needs the knowledge how the individual parts are hashed... some hash methods can have problems with certain combiners. That's just my educated guess... if it's true, it's hard to see how you could standardize this in a sensible way. – Karoly Horvath Sep 17 '14 at 16:28
  • 3
    @einpoklum: It doesn't matter if it's different. Usually a hash function is only consistent for a given process. See the link in SSJ_GZ's comment. – Neil G Sep 08 '15 at 03:48
  • 2
    Perhaps the standards committee declined this because it is not, in fact, a very good general way of combining hashes? Perhaps there is no universal good way, so standardizing one way would be against fundamental policies of the standard library? – Raedwald Dec 05 '17 at 22:54
  • @Raedwald, that's a bit of an exageration... Just look at the implementations... One that I know of, performs the hash of all basic types by combining the hashes of all bytes with an algorithm that is a bit weaker than this one. Still, it is useful, we just want to declare an unordered_set. In this case, all we want is an unordered_set< pair > or something, there's no need to get cryptographic about this. – migle Jan 08 '18 at 17:48
  • 1
    See Yakk's answer to [Why is XOR the default way to combine hashes](https://stackoverflow.com/questions/5889238/why-is-xor-the-default-way-to-combine-hashes) for an explanation of this algorithm and a small improvement for 64-bits systems in the comments. – migle Jan 08 '18 at 17:55
49

I'll share it here since it can be useful to others looking for this solution: starting from @KarlvonMoor answer, here's a variadic template version, which is terser in its usage if you have to combine several values together:

inline void hash_combine(std::size_t& seed) { }

template <typename T, typename... Rest>
inline void hash_combine(std::size_t& seed, const T& v, Rest... rest) {
    std::hash<T> hasher;
    seed ^= hasher(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
    hash_combine(seed, rest...);
}

Usage:

std::size_t h=0;
hash_combine(h, obj1, obj2, obj3);

This was written originally to implement a variadic macro to easily make custom types hashable (which I think is one of the primary usages of a hash_combine function):

#define MAKE_HASHABLE(type, ...) \
    namespace std {\
        template<> struct hash<type> {\
            std::size_t operator()(const type &t) const {\
                std::size_t ret = 0;\
                hash_combine(ret, __VA_ARGS__);\
                return ret;\
            }\
        };\
    }

Usage:

struct SomeHashKey {
    std::string key1;
    std::string key2;
    bool key3;
};

MAKE_HASHABLE(SomeHashKey, t.key1, t.key2, t.key3)
// now you can use SomeHashKey as key of an std::unordered_map
Matteo Italia
  • 123,740
  • 17
  • 206
  • 299
  • Why is the seed always bitshifted by 6 and 2, respectively? – j00hi Jul 04 '19 at 08:35
  • @j00hi It's the algorithm used by Boost. https://www.boost.org/doc/libs/1_35_0/doc/html/boost/hash_combine_id241013.html. That's a good starting point for research. – Stewart Smith Jul 11 '21 at 19:52
  • The Boos docs seem to be very silent on why it was implemented in this particular way. See https://stackoverflow.com/q/35985960/3907364 instead – Raven Dec 13 '22 at 14:23
10

I really like the C++17 approach from the answer by vt4a2h, however it suffers from a problem: The Rest is passed on by value whereas it would be more desirable to pass them on by const references (which is a must if it shall be usable with move-only types).

Here is the adapted version which still uses a fold expression (which is the reason why it requires C++17 or above) and uses std::hash (instead of the Qt hash function):

template <typename T, typename... Rest>
void hash_combine(std::size_t& seed, const T& v, const Rest&... rest)
{
    seed ^= std::hash<T>{}(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    (hash_combine(seed, rest), ...);
}

For completeness sake: All the types which shall be usable with this version of hash_combine must have a template specialization for hash injected into the std namespace.

Example:

namespace std // Inject hash for B into std::
{
    template<> struct hash<B>
    {
        std::size_t operator()(B const& b) const noexcept
        {
            std::size_t h = 0;
            cgb::hash_combine(h, b.firstMember, b.secondMember, b.andSoOn);
            return h;
        }
    };
}

So that type B in the example above is also usable within another type A, like the following usage example shows:

struct A
{
    std::string mString;
    int mInt;
    B mB;
    B* mPointer;
}

namespace std // Inject hash for A into std::
{
    template<> struct hash<A>
    {
        std::size_t operator()(A const& a) const noexcept
        {
            std::size_t h = 0;
            cgb::hash_combine(h,
                a.mString,
                a.mInt,
                a.mB, // calls the template specialization from above for B
                a.mPointer // does not call the template specialization but one for pointers from the standard template library
            );
            return h;
        }
    };
}
j00hi
  • 5,420
  • 3
  • 45
  • 82
  • 1
    In my opinion it is nicer to use the `Hash` template arguments of the standard containers to specify your custom hasher instead of injecting it into the `std` namespace. – Henri Menke Dec 18 '19 at 08:54
7

A few days ago I came up with slightly improved version of this answer (C++ 17 support is required):

template <typename T, typename... Rest>
void hashCombine(uint& seed, const T& v, Rest... rest)
{
    seed ^= ::qHash(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    (hashCombine(seed, rest), ...);
}

The code above is better in terms of code generation. I used qHash function from Qt in my code, but it's also possible to use any other hashers.

vt4a2h
  • 179
  • 1
  • 6
  • Write the fold expression as `(int[]){0, (hashCombine(seed, rest), 0)...};` and it'll also work in C++11. – Henri Menke Dec 18 '19 at 08:41
  • what's the point of using the fold expression if you use the recursive unroll anyway? And why do you believe this is better (than what?) in terms of code generation? – Slava Aug 18 '22 at 15:03
  • The main point is to reduce code generation. This is can be easily checked by using godbolt and/or cppinsights. – vt4a2h Aug 19 '22 at 20:39
  • Right, I think I should clarify it a little bit more. In the original answer will be generated as many hashCombine as the number of arguments you have passed. For the option I suggested will be generated as many functions as the number of types of arguments you have +1. If you have at least two arguments of the same type, my option is better in terms of code generation. Of course, we're not considering simple cases when a compiler can optimize it, or even calculate hash in the compile time. – vt4a2h Aug 19 '22 at 20:54
6

The answer by vt4a2h is certainly nice but uses the C++17 fold expression and not everyone is able to switch to a newer toolchain easily. The version below uses the expander trick to emulate a fold expression and works in C++11 and C++14 as well.

Additionally, I marked the function inline and use perfect forwarding for the variadic template arguments.

template <typename T, typename... Rest>
inline void hashCombine(std::size_t &seed, T const &v, Rest &&... rest) {
    std::hash<T> hasher;
    seed ^= hasher(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    (int[]){0, (hashCombine(seed, std::forward<Rest>(rest)), 0)...};
}

Live example on Compiler Explorer

Henri Menke
  • 10,705
  • 1
  • 24
  • 42
  • Looks much better, thank you! I probably didn't care about passing by value, because I used some implicitly shared objects, for example, like QString. – vt4a2h Dec 18 '19 at 14:07
5

This could also be solved by using a variadic template as follows:

#include <functional>

template <typename...> struct hash;

template<typename T> 
struct hash<T> 
    : public std::hash<T>
{
    using std::hash<T>::hash;
};


template <typename T, typename... Rest>
struct hash<T, Rest...>
{
    inline std::size_t operator()(const T& v, const Rest&... rest) {
        std::size_t seed = hash<Rest...>{}(rest...);
        seed ^= hash<T>{}(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        return seed;
    }
};

Usage:

#include <string>

int main(int,char**)
{
    hash<int, float, double, std::string> hasher;
    std::size_t h = hasher(1, 0.2f, 2.0, "Hello World!");
}

One could certainly make a template function, but this could cause some nasty type deduction e.g hash("Hallo World!") will calculate a hash value on the pointer rather than on the string. This is probably the reason, why the standard uses a struct.

kiloalphaindia
  • 561
  • 3
  • 7
1

The answer by Henri Menke works great, but if you treat warnings as errors with for example:

add_compile_options(-Werror)

GCC 9.3.0 will give this error:

Test.h:223:67: error: ISO C++ forbids compound-literals [-Werror=pedantic]
  223 |     (int[]){0, (hashCombine(seed, std::forward<Rest>(rest)), 0)...};
      |                                                                  ^
cc1plus: all warnings being treated as errors

We can update the code to avoid the error like this:

template <typename T, typename... Rest>
inline void hashCombine(std::size_t &seed, T const &v, Rest &&... rest) {
    std::hash<T> hasher;
    seed ^= (hasher(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2));
    int i[] = { 0, (hashCombine(seed, std::forward<Rest>(rest)), 0)... };
    (void)(i);
}
Johnex
  • 11
  • 1
1

You can use the rst C++ library that I developed to do that:

#include "rst/stl/hash.h"

struct Point {
  Point(const int x, const int y) : x(x), y(y) {}

  int x = 0;
  int y = 0;
};

bool operator==(const Point lhs, const Point rhs) {
  return (lhs.x == rhs.x) && (lhs.y == rhs.y);
}

namespace std {

template <>
struct hash<Point> {
  size_t operator()(const Point point) const {
    return rst::HashCombine({point.x, point.y});
  }
};

}