0

I am looking for a simple incremental hash function (C++), such that it can be updated using:

               hash = hash_function(hash, update_value)

update_value for example can be a single bit.

for example, in order to compute the hash of an array, I would do:

              hash = 0
              foreach element a in array { hash = hash_function(hash, a) }

(preferably something that doesn't lead to too many collisions, but relatively fast.)

kloop
  • 4,537
  • 13
  • 42
  • 66
  • Why don't you just XOR the hash of every element in the array together – Pete Baughman Jun 04 '14 at 20:51
  • @Pete, the elements in the array can be just bits. XOR is a bit idea, the hash will just be a single bit. I need some better transformation, and I don't want it to lead to many collisions. – kloop Jun 04 '14 at 21:18
  • See incremental checksums here: http://stackoverflow.com/q/1173481/583570 – markgz Jun 04 '14 at 23:15

1 Answers1

1

If you're hashing an array of bits:

You could implement a Cyclic redundancy check. The CRC polynomial would determine the hash length and would (roughly) control the likelihood of a collision. Many example software CRC algorithms are optimized to operate on things wider than bits, but the core, un-optimized algorithm works a bit at a time. The algorithm is roughly:

  1. Start with some constant seed value in the accumulator
  2. Shift a bit from your array into the accumulator.
  3. Conditionally, perform an XOR with the polynomial. Different implementations either use the bit you just shifted out, the bit you just shifted in for the conditional.
  4. Repeat for subsequent bits (Goto 2).

Your proposed method would take the current accumulator value as the 1st argument, and return the next accumulator value.

Polynomial selection is important. There are some polynomials that are not considered good for hashing.

If the array contains something a little wider (like ints or objects):

You could just hash each element, and combine the hash of each element together with something like XOR. If the hash algorithm for the individual objects is good, then the resulting hash for the array should be relatively ok too. Note that it's very important to hash the individual objects first.

Pete Baughman
  • 2,996
  • 2
  • 19
  • 33