Generic hash for tuples in unordered_map / unordered_set

Question

Why doesn't std::unordered_map<tuple<int, int>, string> just work out of the box? It is tedious to have to define a hash function for tuple<int, int>, e.g.

template<> struct do_hash<tuple<int, int>>                               
{   size_t operator()(std::tuple<int, int> const& tt) const {...}  };

Building an unordered map with tuples as keys (Matthieu M.) shows how to automate this for boost::tuple. Is there anyway of doing this for c++0x tuples without using variadic templates?

Surely this should be in the standard :(

If you're using C++0x why not use variadic templates? (or is there an implementation with tuples, but no varidic templates?) — R. Martinho Fernandes, Aug 18 '11 at 15:59
@RMartinho : VC++ 2010 matches that description (and it seems the next version of VC++ is likely to as well). — ildjarn, Aug 18 '11 at 20:39
If std::tuple is implemented by manually writing out templates of 1 to N-arity (like boost::tuple), then I guess the std::hash specialisation necessary to hash the tuples will also have to be manually written out (using too clever preprocessing?) the same way. Aaargh! — Leo Goodstadt, Aug 18 '11 at 23:02

Leo Goodstadt · Accepted Answer · 2014-05-20T11:37:17.077

This works on gcc 4.5 allowing all c++0x tuples containing standard hashable types to be members of unordered_map and unordered_set without further ado. (I put the code in a header file and just include it.)

The function has to live in the std namespace so that it is picked up by argument-dependent name lookup (ADL).

Is there a simpler solution?

#include <tuple>
namespace std{
    namespace
    {

        // Code from boost
        // Reciprocal of the golden ratio helps spread entropy
        //     and handles duplicates.
        // See Mike Seymour in magic-numbers-in-boosthash-combine:
        //     http://stackoverflow.com/questions/4948780

        template <class T>
        inline void hash_combine(std::size_t& seed, T const& v)
        {
            seed ^= std::hash<T>()(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
        }

        // Recursive template code derived from Matthieu M.
        template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>
        struct HashValueImpl
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            HashValueImpl<Tuple, Index-1>::apply(seed, tuple);
            hash_combine(seed, std::get<Index>(tuple));
          }
        };

        template <class Tuple>
        struct HashValueImpl<Tuple,0>
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            hash_combine(seed, std::get<0>(tuple));
          }
        };
    }

    template <typename ... TT>
    struct hash<std::tuple<TT...>> 
    {
        size_t
        operator()(std::tuple<TT...> const& tt) const
        {                                              
            size_t seed = 0;                             
            HashValueImpl<std::tuple<TT...> >::apply(seed, tt);    
            return seed;                                 
        }                                              

    };
}

Standard Conformant code

Yakk points out that specialising things in the std namespace is actually undefined behaviour. If you wish to have a standards conforming solution, then you need to move all of this code into your own namespace and give up any idea of ADL finding the right hash implementation automatically. Instead of :

unordered_set<tuple<double, int> > test_set;

You need:

unordered_set<tuple<double, int>, hash_tuple::hash<tuple<double, int>>> test2;

where hash_tuple is your own namespace rather than std::.

To do this, you first have to declare a hash implementation inside the hash_tuple namespace. This will forward all non tuple types to the std::hash:

namespace hash_tuple{

template <typename TT>
struct hash
{
    size_t
    operator()(TT const& tt) const
    {                                              
        return std::hash<TT>()(tt);                                 
    }                                              
};
}

Make sure that hash_combine calls hash_tuple::hash and not std::hash

namespace hash_tuple{

namespace
    {
    template <class T>
    inline void hash_combine(std::size_t& seed, T const& v)
    {
        seed ^= hash_tuple::hash<T>()(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
    }
}

Then include all the other previous code but put it inside namespace hash_tuple and not std::

namespace hash_tuple{

    namespace
    {
        // Recursive template code derived from Matthieu M.
        template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>
        struct HashValueImpl
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            HashValueImpl<Tuple, Index-1>::apply(seed, tuple);
            hash_combine(seed, std::get<Index>(tuple));
          }
        };

        template <class Tuple>
        struct HashValueImpl<Tuple,0>
        {
          static void apply(size_t& seed, Tuple const& tuple)
          {
            hash_combine(seed, std::get<0>(tuple));
          }
        };
    }

    template <typename ... TT>
    struct hash<std::tuple<TT...>> 
    {
        size_t
        operator()(std::tuple<TT...> const& tt) const
        {                                              
            size_t seed = 0;                             
            HashValueImpl<std::tuple<TT...> >::apply(seed, tt);    
            return seed;                                 
        }                                              
    };

}

Assuming you don't want to use boost libraries shouldn't you have std::hash_combine instead of boost::hash_combine? — Bo Lu, Jul 30 '12 at 19:30
Your template compiled using std::hash_combine with gcc 4.6.3. I made the switch to avoid using boost — Bo Lu, Apr 18 '13 at 14:36
@LeoGoodstadt: +1. Your hash is almost certainly better than bare XOR. I've been told that `seed += hash(v); seed *= m`, where *m* is a large odd number, is a good choice, but I don't know whether its better or worse than yours. — Marcelo Cantos, Apr 24 '13 at 21:23
@Bo Lu: I was wondering what you were talking about since I include the hash_combine explicitly to avoid boost dependencies. Then I noticed I forgot to remove the boost namespace qualifier from my code... Hope it still works. — Leo Goodstadt, May 16 '13 at 18:12
Not worth the undefined behavior: don't specialize things in `std::` involving things you do not own, and you don't own `std::tuple`. As a concrete exmaple of how this could horribly break code, what happens when a new iteration of the standard introduces its own hash specialization? What happens when someone else who has your bright idea introduces a narrow `hash>` specialization, which is visible from some but not all places where `hash>` is used? Those are concrete examples, but the UB is not bounded by them. Your program is ill formed. — Yakk - Adam Nevraumont, Mar 04 '14 at 16:13
Thanks @Yakk. Answer updated to suggest avoiding the `std::` namespace. — Leo Goodstadt, May 20 '14 at 11:39
Doesn't touching std result in undefined behavior? Does using an anon namespace in std get around that?? — allyourcode, May 27 '14 at 21:55
@allyourcode This is old, but adding specializations to the std namespace is actually recommended. Adding classes, functions, or other definitions is explicitly forbidden. http://en.cppreference.com/w/cpp/language/extending_std — Alex Huszagh, Apr 29 '17 at 21:41
@AlexanderHuszagh You can add specializations into `std` namespace only for custom types (therefore not for `std::tuple`). This is what Yakk mentioned in his/her comment. — Daniel Langr, Feb 01 '18 at 13:57
@DanielLangr Neither mine nor @Yakk's comments are contradictory. You are expressly encouraged to add specializations in the `std` namespace, doing so for types already in the standard library is bound to cause problems. — Alex Huszagh, Feb 02 '18 at 02:46
rather than having `hash_tuple::hash` be a template, you could have a single `hash_tuple` object with a template `operator()` — Caleth, Jan 15 '21 at 11:11

Вова · Answer 2 · 2013-03-26T12:02:02.683

14

#include <boost/functional/hash.hpp>
#include <tuple>

namespace std
{

template<typename... T>
struct hash<tuple<T...>>
{
    size_t operator()(tuple<T...> const& arg) const noexcept
    {
        return boost::hash_value(arg);
    }
};

}

edited Mar 26 '13 at 12:02

answered Mar 26 '13 at 11:50

Вова

309
2
8

score 10 · Answer 3 · answered Mar 12 '19 at 02:56

With C++20, it is possible to use fold expressions and generic lambdas to compute hash of a tuple without recursion. I prefer to rely on std::hash<uintmax_t> instead of manually combining hashes:

#include <cinttypes>
#include <cstddef>
#include <functional>
#include <tuple>

class hash_tuple {
    template<class T>
    struct component {
        const T& value;
        component(const T& value) : value(value) {}
        uintmax_t operator,(uintmax_t n) const {
            n ^= std::hash<T>()(value);
            n ^= n << (sizeof(uintmax_t) * 4 - 1);
            return n ^ std::hash<uintmax_t>()(n);
        }
    };

public:
    template<class Tuple>
    size_t operator()(const Tuple& tuple) const {
        return std::hash<uintmax_t>()(
            std::apply([](const auto& ... xs) { return (component(xs), ..., 0); }, tuple));
    }
};

- 1 in sizeof(uintmax_t) * 4 - 1 is optional, but appears to slightly improve hash distribution. This class can be used both with std::tuple and std::pair.

Note that the line "n ^ std::hash()(n);" will only return 0 on platforms where the default hash function is identity, making this hash function useless. — wecx, Mar 30 '22 at 19:46
Usage of ```operator,``` vs. ```operator+``` or ```operator^```, the latter two seem more appropriate of course subjectively speaking. — user3882729, Sep 05 '22 at 03:01

Mooing Duck · Answer 4 · 2014-03-04T17:13:57.660

10

In my C++0x draft, 20.8.15 says hash is specialized for built-in types (including pointers, but doesn't seem to imply dereferencing them). It also appears to be specialized for error_code, bitset<N>, unique_ptr<T, D>, shared_ptr<T>, typeindex, string, u16string, u32string, wstring, vector<bool, Allocator>, and thread::id. (facinating list!)

I've not used C++0x variadics, so my formatting is probably way off, but something along these lines might work for all tuples.

size_t hash_combiner(size_t left, size_t right) //replacable
{ return left + 0x9e3779b9 + (right<<6) + (right>>2);}

template<int index, class...types>
struct hash_impl {
    size_t operator()(size_t a, const std::tuple<types...>& t) const {
        typedef typename std::tuple_element<index, std::tuple<types...>>::type nexttype;
        hash_impl<index-1, types...> next;
        size_t b = std::hash<nexttype>()(std::get<index>(t));
        return next(hash_combiner(a, b), t); 
    }
};
template<class...types>
struct hash_impl<0, types...> {
    size_t operator()(size_t a, const std::tuple<types...>& t) const {
        typedef typename std::tuple_element<0, std::tuple<types...>>::type nexttype;
        size_t b = std::hash<nexttype>()(std::get<0>(t));
        return hash_combiner(a, b); 
    }
};

template<class...types>
struct tuple_hash<std::tuple<types...>> {
    size_t operator()(const std::tuple<types...>& t) {
        const size_t begin = std::tuple_size<std::tuple<types...>>::value-1;
        return hash_impl<begin, types...>()(0, t);
    }
}

This version actually compiles and runs

Yakk has observed that specializing std::hash directly is technically not allowed, since we're specializing a standard library template with a declaration that does not depend on a user-defined type.

edited Mar 04 '14 at 17:13

answered Aug 18 '11 at 17:19

Mooing Duck

64,318
19
100
158

1

Use `left() ^ right()` if you don't know what to do. See http://stackoverflow.com/questions/5889238/why-is-xor-the-default-way-to-combine-hashes . But note that XOR is not always the right choice. Using plain addition can turn out to be better if you expect the tuples to contain duplicate members. This is probably the reason why there is no standard hash for tuples. – Alexandre C. Aug 18 '11 at 17:23
3

It must be an oversight that tuple is not hashable. Too late for the standard though :( – Leo Goodstadt Aug 18 '11 at 23:03
@Leo oddly enough, no container except for vector, and basic_string is hashable. (is bitset technically a container?) – Mooing Duck Aug 18 '11 at 23:40
oh, I just now noticed: `without using variadic templates?` whoops. The answer is no, can't be done for all tuples, since a tuple is a variadic template type. – Mooing Duck Aug 18 '11 at 23:43
@Alexandre Turns out this revised code looks almost exactly like my answer (after reformatting for whitespace) except that (1) I use a different (more robust?) way of dealing with entropy when combining hash values and (2) I don't have to deduce the type of the "next" tuple value explicitly using std::tuple_element because my hash_combine function is parameterised on its type. – Leo Goodstadt Apr 18 '13 at 11:05
4

@AlexandreC.: `^` and `+` are both commutative, and are thus poor choices for combining hashes. Consider how `std::unordered_set>` would handle permutations of {1, 2, …, 10}. Instead, use a non-commutative combiner such as `m * left + right`, where *m* is a large odd number. – Marcelo Cantos Apr 24 '13 at 21:19
Specializing `hash` for a type you do not own means your program is ill formed. And you don't own all of `hash>`. And `^` is a horrible hash combiner. – Yakk - Adam Nevraumont Mar 04 '14 at 16:14

score 2 · Answer 5 · answered Sep 21 '22 at 12:54

2

If you want to do it in a simple way. Just do

std::unordered_map<std::tuple<int, int>, std::string, boost::hash<std::tuple<int, int>>> mp;

answered Sep 21 '22 at 12:54

Ravi Kumar Yadav

193
2
15

Generic hash for tuples in unordered_map / unordered_set

5 Answers5

Standard Conformant code

Linked

Related