0

I assume that there is a reason behind this design choice. Boost seems to have an implementation for it, hence it should be possible to use vectors as hash table keys. Are there any theoretical properties for hash functions applied to arrays that make them more prone to collisions or other undesirable behavior?

A. Fenzry
  • 414
  • 3
  • 12
  • I don't think there's any reason why you can't use vectors as keys to an unordered_map; you'll just need to specify a hash function that computes a hash value from a vector. – Jeremy Friesner May 15 '20 at 00:52
  • Can you provide an example of a hash function suitable for vectors? – A. Fenzry May 15 '20 at 01:00
  • 3
    Not being constant time complexity seems like something that might have influenced the decision not to specialize `std::hash` for `std::vector`, making sure that the user knows what they're getting into if they want to use one as a key. That's a guess, though. – chris May 15 '20 at 01:03
  • Good observation @chris – A. Fenzry May 15 '20 at 01:07
  • 1
    You need to roll your own hash function for that. Not difficult at all. – user1095108 May 15 '20 at 01:09
  • @A.Frenzy Your hash-function could simply compute a hash-value from each element in the vector and sum them all together (or maybe multiply each element's hash-value by (index+1) before adding it to the sum, so that two vectors with the same values but in a different order will have different hash-values from each other). – Jeremy Friesner May 15 '20 at 02:07

1 Answers1

3

You'll notice Boost doesn't actually have an extension to accept a vector<T> as a key specifically - instead it has an extension that lets you use any Iterable - and it generates the hash only as a function of the Iterable's contents...

This may or may not be desirable, depending on:

  • If you want to use object-identity rather than object-value as the basis for hashing... or not.
  • If you're comfortable with hashing being a non-constant-time operation... or not.
    • Just because boost::hash_range appears to be O(n) doesn't mean the underlying iterable won't take 5 minutes to return all hashable values for each call...
  • If the order of elements does - or doesn't matter.
    • (I believe) using boost::hash_range or boost::hash_combine with one of two distinct but equivalent unordered_set objects will result in different hash-codes despite their value-equivalence.
  • If two conceptually different objects that can iterate over the same values (e.g. a vector<uint8_t> representing a data buffer, or queue<SomeEnum> where SomeEnum : uint8_t representing a queue of values) should have the same hahs-code... or not.

I suspect the team behind the STL doesn't like the fact that there's so many contextual "if"s described above which would mean it wouldn't be sensible to provide default behaviour and so they require you to always be more explicit with your hash-generation for arbitrary objects (besides, if you want Boost's behaviour, then just use Boost in the first place - it's not like Boost is competing with the STL).

Also see this QA: C++ unordered_map using a custom class type as the key

Dai
  • 141,631
  • 28
  • 261
  • 374