5

I'm trying to use an Avalanche mixer to hash integer coordinates. I've been using Murmur3's 32bit and 64bit avalanche mixers to do so (and not the actual total hash function). For my application the entire hash function is not needed, only the Avalanche Mixer seen here:

uint32_t murmurmix32( uint32_t h )
{
  h ^= h >> 16;
  h *= 0x85ebca6b;
  h ^= h >> 13;
  h *= 0xc2b2ae35;
  h ^= h >> 16;

  return h;
}


uint64_t murmurmix64( uint64_t h )
{
  h ^= h >> 33;
  h *= 0xff51afd7ed558ccdULL;
  h ^= h >> 33;
  h *= 0xc4ceb9fe1a85ec53ULL;
  h ^= h >> 33;

  return h;
}

These appear fast on my machine, I take two uint32_ts and mix them into these functions to produce avalanched results, this produces a psuedorandom distribution to my liking.

I want to introduce more coordinates to this system (ie z and w), so I want to use larger avalanche mixers to hash my coordinates. I believe for my puroposes the max value I want to see come out of the function itself is uint64_t, collisions themselves are not a problem, but the randomness of the results are.

It does not appear that murmur3 has a larger avalanche mixer than 64. I've looked at this website and this one to get a few clues on some alternative avalanche hashes:

The quality of these avalanches seem to be good enough for my application but I'm particularly interested in City hash's murmur inspirations.

In CityHash, they have a "murmur inspired" mixer:

uint64 Hash128to64(const uint64_t& x_high, const uint64_t& x_low) {
  // Murmur-inspired hashing.
  const uint64 kMul = 0x9ddfea08eb382d69ULL;
  uint64 a = (x_low ^ x_high) * kMul;
  a ^= (a >> 47);
  uint64 b = (x_high ^ a) * kMul;
  b ^= (b >> 47);
  b *= kMul;
  return b;
}

This seems quite fast for two 64 bit numbers. I'm confused as to how they derived their own "inspired" hash from Murmur. How would one go about creating their own 2^n bit murmur avalanche mixer?

derrysan7
  • 457
  • 1
  • 5
  • 15
Krupip
  • 4,404
  • 2
  • 32
  • 54
  • I think it needs a lot of experimentation. You need to have some reversible transformations (like multiplying with an odd number, xoring with a right-shifted-itself, etc.), then put them in some order, then measure avalanche. Note: if I'm not mistaken, you need this for your noise generator. In this case, your output has less bits, than your input, so these general purpose hash functions do too much. So maybe you can find a little bit simpler hash function, which has good avalanche properties for only the low X bits. – geza Jul 20 '17 at 15:26
  • @geza Yes, I need this for noise. I've not found any simpler avalanche mixers for the size I'm looking at. Currently I'm using murmur 3 which work well and is fast, murmur seems to be the fastest i can manage and still get good avalanche behavior. – Krupip Jul 20 '17 at 15:54

2 Answers2

2

Pelle Evensen has already largely answered your question, "How to create a custom Murmur Avalanche Mixer", on his blog posts on mostlymangling.blogspot.com, especially the following two:

Peter O.
  • 32,158
  • 14
  • 82
  • 96
0

If you really are interested not in collisions, but in the randomness of the results, then you should try to use PRNG with 128bits state and 64bits output.

And pretty good is well-known PRNG called Xoroshiro128+ - fast, quite good randomness.

Code could be found here

UPDATE

Yes, looks like problem to use it for caching due to the fact RNG returns first just a sum modulo 264. Wondering if simple modification (basically, moving result computation after the rotations/xors) will help

static inline uint64_t rotl(const uint64_t x, int k) {
    return (x << k) | (x >> (64 - k));
}

uint64_t next(uint64_t* s) {
    const uint64_t s0 = s[0];
    uint64_t s1 = s[1];

    s1 ^= s0;
    s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
    s[1] = rotl(s1, 36); // c

    return s[0] + s[1];
}
Severin Pappadeux
  • 18,636
  • 3
  • 38
  • 64
  • See [my previous question](https://stackoverflow.com/questions/45120396/why-does-switching-from-mersenne-twister-to-other-prngs-in-gradient-noise-genera) on why PRNGs typically don't actually work for this sans adding another hash function on top of the output. While they do well on arbitrary seeds in producing a good PRNG output, due to the poor avalanching behavior (which is not needed typically for PRNG) if you have a seed with only one bit of difference you will often get pretty bad PRNG results. In my previous post even Xorshift+ did poorly producing odd output. – Krupip Jul 21 '17 at 03:20
  • @snb `poor avalanching behavior (which is not needed typically for PRNG)` - it was used to be like that but not anymore, people are adding tests for PRNG characteristics similar to good avalanching behavior, and Xoroshiro128+ passed them all. Code is here, just try it... – Severin Pappadeux Jul 21 '17 at 13:53
  • Note that the underlying problem is that a random number is generated directly after seeding the PRNG. In the case of `xoroshiro128+` (and `xorshift128+`), that random number is just the sum of the two halves of the seed, mod 2^64. – Peter O. Jul 21 '17 at 17:28
  • @PeterO. Well then, that's an easy problem to fix - skip first number and start with second one – Severin Pappadeux Jul 21 '17 at 19:22
  • @PeterO Ok, that was wrong statement, and you're right, for hash purposes having returning sum is bad. Updated answer. – Severin Pappadeux Jul 21 '17 at 20:09
  • @SeverinPappadeux: the thing is, these xorshift generators are very sensitive to seeding. So, for example, if you put generated random numbers to a 2D image, where the s[0]=x, s[1]=y, and the image pixel is the output from the generator, you'll see patterns. And you'll even see patterns, if you generate 1000 numbers after seeding, and use 1001st as output. – geza Jul 22 '17 at 13:00