4

If I needed to hash an entire HashSet<T> or HashMap<T, U>, where T already had some hash algorithm implemented, how would I do it? Note that I am not asking about hashing elements of a hash table, I'm talking about hashing the entire data structure itself. This is not too difficult with an ordered set like a TreeSet, but because the order of the elements of a hash table is not well-defined, this is more tricky. Sorting the elements is infeasible in the general case, as the algorithm should take no more than O(n) time.

I'm looking for a general, language independent example, but you can provide code or links to code from any language.

gsgx
  • 12,020
  • 25
  • 98
  • 149

2 Answers2

4

Your options are to

  1. Enforce an order for purposes of creating the hash
  2. Apply a hash algorithm that is commutative (independent of order)

The first option may be viable if the number of elements is relatively small. You can sort the hash elements e.g. by hash value (of each element), then apply well-known hash-combining techniques such as multiplying each successive element's contribution to the hash by (SomePrime)^n.

For the second option, simply adding the hash of each element in the hash together may provide a suitable distribution, since the hash of each element itself should already be very well distributed.

Eric J.
  • 147,927
  • 63
  • 340
  • 553
  • For context, Python uses option #2 for it's `frozenset` hash: http://stackoverflow.com/questions/20832279/python-frozenset-hashing-algorithm-implementation . – orlp Apr 28 '15 at 04:22
  • @orlp I was just about ask for example of 2, as 1 wouldn't work for a large hash table. Any other references, examples, or links would be appreciated. – gsgx Apr 28 '15 at 04:24
  • Sorting is infeasible in the general case, as I'd like the algorithm to take no more than O(n). – gsgx Apr 28 '15 at 04:25
  • Like I said, 1 *may be viable if the number of elements is relatively small*. If n << infinity, it's considered O(1) (since Big O is concerned with behavior as n approaches infinity). If n is not "small", you are left with option 2. – Eric J. Apr 28 '15 at 04:38
0

Introduce new field for datastructure where you can keep hashbase. On each addition of element to hashmap/hahset do something like hashbase += element.hash if element is not yet there. Use this hashbase for hash calculation.

edmarisov
  • 138
  • 5