1

My situation is the same as in this question, but instead of using floats, I'm using integers. I feel like this difference might be significant because there might be some bit twiddling hacks that can be used to wrap the value into the range.

One use case for wrapping a value into a range is assigning a value to a bucket in a hash table, like this (range of [0,len(buckets)]):

bucket = buckets[hash(key) % len(buckets)]

If we are continuously adding values to a hash table, it might be worthwhile to optimize this operation.

Lefol
  • 21
  • 4
  • 2
    The special case of mapping to [0,n] appears to be a duplicate of [this question](https://stackoverflow.com/questions/73646834/fast-integer-division-and-modulo-with-a-const-runtime-divisor/73655977#73655977). – njuffa Sep 28 '22 at 18:46

1 Answers1

1

Since modulo isn't actually needed, there is a shortcut to mapping a hash into a new range. See Lemire's FastRange.

// map a **full-width 32-bit** hash to [0,p) without division.
uint32_t fastrange32(uint32_t word, uint32_t p) {
    return (uint32_t)(((uint64_t)word * (uint64_t)p) >> 32);
}

Simply take the high half of a 32x32 -> 64-bit multiply of word * p. If all your word values are smallish, all your results will be 0, so this is only good for word values that are fairly uniformly distributed over the full 32-bit range. Hash functions are normally fine.

I guess that is slightly biased, but that can be worked around see https://github.com/apple/swift/pull/39143


Often the number of buckets is kept as a power of 2. This allows unwanted bits to just get masked-off. (e.g. If there are 16 buckets then bucket_id = hash & 0xF.) The downside is that it requires the number of buckets to double each time the table is resized.

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
aqrit
  • 792
  • 4
  • 14