0

Consider the standard Murmurhash, giving 32-bit output values.

Suppose that we apply it on 32-bit inputs -- are there collisions?

In other words, does Murmurmash basically encodes a permutation when applied to 32-bit inputs? If collisions exist, can anyone give an example (scanning random inputs didn't yield any)?

M A
  • 209
  • 2
  • 7

1 Answers1

2

I assume you mean MurmurHash3, 32 bit, and specially the 32-bit fmix method:

FORCE_INLINE uint32_t fmix32 ( uint32_t h )
{
  h ^= h >> 16;
  h *= 0x85ebca6b;
  h ^= h >> 13;
  h *= 0xc2b2ae35;
  h ^= h >> 16;

  return h;
}

If not, then you need to better specify what you mean.

For the above, there are no collisions (two distinct inputs won't result in the same output). There is only one entry that returns the input value: 0.

As there are not "that many" 32-bit values, you can actually iterate over all of them to verify, in a couple of minutes. This will require some memory for a bit field, but that's it.

Btw, there is also a way to reverse the function (get the input from the output).

Thomas Mueller
  • 48,905
  • 14
  • 116
  • 132
  • Thanks! Is it known that this methods have no collisions? Is there a way to prove it other than trying all options? – M A Jul 15 '21 at 08:27
  • 1
    Yes, I think there is a way to prove it, but I'm afraid it's beyond what I could do. See also [this answer](https://stackoverflow.com/questions/664014/what-integer-hash-function-are-good-that-accepts-an-integer-hash-key/12996028#12996028). I think in general, if there is a way to reverse a function, it is guaranteed there can't be any collisions. Otherwise, it wouldn't be possible to revert. – Thomas Mueller Jul 15 '21 at 09:20
  • That makes sense, the question is how did they verify the reverse function (that'd be the proof in this case). – M A Aug 10 '21 at 18:21
  • I verified using brute force :-) – Thomas Mueller Aug 11 '21 at 09:28