10

I'm guessing std::hash is defined as a template struct in order to avoid implicit type conversions done during overloaded function resolution. Is it a correct thing to say?

I mean, I would prefer to write

std::string s;
size_t hash = std::hash(s);

instead of

std::string s;
size_t hash = std::hash<std::string>()(s);

But I'm guessing there's a reason for why the standards committee has chosen the second option.

Aykhan Hagverdili
  • 28,141
  • 6
  • 41
  • 93
Alexei Sholik
  • 7,287
  • 2
  • 31
  • 41

2 Answers2

13

It's impossible to partially specialize function template, and so for user-defined templated classes, the would be no way to specialize std::hash if it was a function. (you can only specialize templates from std namespace, but not overload, so users of your templated class couldn't create std::unordered_map<MyClass<whatever>, MyOtherClass>, they would be forced to choose std::unordered_map<MyClass<whatever>, MyOtherClass, ???>). So functor is solution here.

namespace std
{
    template<typename T>
    struct hash<MyVector<T>>
    {
        size_t operator()(const MyVector<T>& v)
        {
            //return hash from here
        }
    };
}

The alternative way for standard library would be using some SFINAE template trick to choose member .hash() as the default, and standard hash if the other case, but in most cases you can't retrofit interfaces (especially if using third-party code)

The other alternative would be like std::swap does (tricks with ADL):

//somewhere in std::unordered_map
using std::hash;
size_t h = hash(key);

In my experience, ADL is tricky and not everyone remembers about corner cases. Furthermore, the advantage of functors here is fact that you can use them as a template parameter, so you can just plug-in another functor for template (like std::unordered_map<A, B, specialized_hash<A>>) if you think that the default is wrongly suited for your case.

From comments:

But could you elaborate a bit more about std::swap? It's still there in C++11 and it has no problems with user-defined types, has it? Why keep a lot of different concepts in STL rather than making it more consistent?

There's a little difference between std::swap and std::hash:

In std::hash:

  • it's likely that, e.g. std::string hash defined by class author won't be enough for your case, that is, it's too generic, and you can guarantee in your hash map that you'll put strings of only one kind, so you can provide hash function that's faster or/and having less collisions.
  • there are many kind of hashes for different purposes so genericity is far less important here
  • in most cases it's possible for you to create a better hash

In std::swap:

  • it's unlikely that you'll want your own swap function, but you'll still probably want to use the one specific for this class, not the generic std::swap one that calls copy constructors.
  • in most cases it's not even possible for you to create a swap function, as it requires knowledge of the class internals (for example, std::vector can be implemented as dynamic array with pointer hidden as private field, so you won't be able to access them, not alone swap them, and even the fact that's implemented that way is not guaranteed)
  • there is (or should be) only one swap.
  • actually, there is a problem with std::swap: standard containers provide swap member function, std::swap can be specialized (but only for non-templated class), and swap can be defined as a free function that's found with ADL. How should you provide your own swap? IMO that is confusing, not the fact that std::swap is function and std::hash is a functor.

Why is STL inconsistent? I can only guess here, but main reason why STL is inconsistent is (a) bawkward compatibility and (b) C++ also is quite inconsistent.

milleniumbug
  • 15,379
  • 3
  • 47
  • 71
  • @milleniumbug: Do you have a standard reference for where it says you can't overload functions defined in the `std` namespace? I would have thought ADL would find your hash function. – Andrew Tomazos Jul 07 '13 at 22:09
  • @user1131467 `17.4.3.1/1 It is undefined for a C++ program to add declarations or definitions to namespace std or namespaces with namespace std unless otherwise specified. A program may add template specializations for any standard library template to namespace std. Such a specialization (complete or partial) of a standard library results in undefined behaviour unless the declaration depends on a user-defined name of external linkage and unless the template specialization meets the standard library requirements for the original template.`(Copied from http://stackoverflow.com/a/109613/1012936 ) – milleniumbug Jul 07 '13 at 22:20
  • 1
    Sure, but if you define `hash(MyType)` in your namespace, and then `std::unordered_map` called `hash(x)` from inside the `std` namespace, then `std::unordered_map` would use your hash function due to ADL (this is one of the main points of it). The question was _why_ does the standard library use a template functor rather than an overloaded function as just described. – Andrew Tomazos Jul 07 '13 at 22:35
  • @user1131467 I just have updated my answer - in short, functor has advantage that not only class creator can create its own hasher, but users of the class can override it if they find the default inappropriate (just like with allocators) – milleniumbug Jul 07 '13 at 22:39
  • 2
    Sorry, try again: This proposal discusses how to avoid opening std namespace for hash specialization and how to use ADL: [link](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3333.html) – Tony Jul 07 '13 at 22:47
  • @milleniumbug I think I get the message that `std::hash` is supposed to be used as a template argument and not as a function. But could you elaborate a bit more about `std::swap`? It's still there in C++11 and it has no problems with user-defined types, has it? Why keep a lot of different concepts in STL rather than making it more consistent? – Alexei Sholik Jul 08 '13 at 12:01
  • @android Since the comment field was too short, I edited my answer and put the response there. – milleniumbug Jul 08 '13 at 15:04
  • @JohanLundberg The code listing is about imaginary another-universe C++ standard library which would have `hash` as a function, just like OP suggests. The rest of the answer is about why this approach would be worse than what we have. – milleniumbug Dec 16 '15 at 19:33
4

One possible reason is that in this way it's easier to use it in templates as default by changeable option

template <typrname T, typename = std::hash<T>...>
class unordered_set;

BTW you may create function that works this way

template<typename T, typename... Args>
auto hasher(Args&&... args) -> whatever { 
    return std::hash<T>(std::forward<Args>(args)...)( //maybe some &'s  skipped
}

or (to allow detect type)

template<typename T, typename... Args> 
auto hasher(T t) -> whatever {
    return std::hash<T>()(t);
}
RiaD
  • 46,822
  • 11
  • 79
  • 123
  • `struct std::hasher { size_t operator()(T &&x) { return std::hash(std::forward(x)) } };` or something to that effect? Equality is done in a similar fashion, `operator==` is overloaded and `std::equal_to` does (by default) delegate to `operator==`. –  Jul 07 '13 at 21:38
  • It's possible, but I don't think it's so important since `hash` is not so usable feature(Usually data structures like sets need it(especially in generic way) only in. Ad if it isn't so important why we need 1 more level of indirection. As of `==` (in C): it was introduced before `std::equal_to` and equal_to was added to already existent feature – RiaD Jul 07 '13 at 21:41
  • As OP showed, it makes calling `hash` much simpler when you don't care about allowing a replacement. As for indirections: Have you looked at the standard library? ;-) –  Jul 07 '13 at 21:43
  • It's possible they didn't think about direct using at all:) – RiaD Jul 07 '13 at 21:45
  • @delnan, it may be inverted btw:) – RiaD Jul 07 '13 at 21:54