1

In many books, syllabus, tutorials I've seen that a good option to find a proper cell of an item is to calculate a number of the cell: item.hash()%(n-1) = # of the bucket.

But why is this certain expression is mentioned?

How does the inverse one (n-1)%item.hash() = # of the bucket differs from it?

P.S. I know that Java HashMap uses (n - 1) & hash, I would like only to catch the difference in sparsing key between these two approaches.

Pasha
  • 1,768
  • 6
  • 22
  • 43
  • Did you mean (n-1) **%** item.hash() or (n-1) **&** item.hash()? – Julius Häger Aug 28 '18 at 22:46
  • 2
    Where do you get that *inverse* code from? Why should it be the number of buckets? It computes something completely different. `5 % 11 = 5` while `11 % 5 = 1`. Modulo is **not** commutative. – Zabuzard Aug 28 '18 at 22:47
  • Think of the `%` operator as computing the remainder after dividing the left argument by the right. This operator is not symmetric in its arguments. You wouldn't expect that the remainder of, say, 12/5 to be the same as the remainder of 5/12, would you? The `&` operator, on the other hand, is symmetric in its arguments. – Ted Hopp Aug 28 '18 at 22:49
  • Also see [Bitwise and in place of modulus operator](https://stackoverflow.com/questions/3072665/bitwise-and-in-place-of-modulus-operator). – Zabuzard Aug 28 '18 at 23:03

3 Answers3

2

How does the inverse one (n-1)%item.hash() = # of the bucket differs from it?

Basically it doesn't work.

That expression needs to reduce the hash code to a value in the range 0 .. n - 1, so that it can be used as a subscript for an array of size n.

But the "inverse" function doesn't do that. So if you tried to use it, the (alleged) bucket subscripts would give exceptions since h % (n - 1) > (n - 1) or < 0 for most values of h in the range of the Java int type.

As @Zubuza notes remainder (%) and division (/) are not commutative.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
2

Think of operator modulus % as a way to distribute uniformly a set of numbers through reducing them over a smaller range. The set of numbers are, of corse, the hashcodes of input keys. The small range is the capacity of the table.

This is a useful technique when you want to assign an index in a small table to store a high number.

The inverse operation sounds quite weird (and useless): Taking in account that the hash codes are high numbers and n is small, n % hash would return always n, so it has no interest at all.

Java choses indexes through hash & (length-1), indeed, which is not aritmetically equivalent to hash % length, but it is an alternative -and cheaper than modulus- formula to reduce and distribute (credits to @Zabuza).

Little Santi
  • 8,563
  • 2
  • 18
  • 46
  • And `hash & (length-1)` is equivalent to `(length-1) & hash` since `&` (bitwise and) is commutative. So maybe OP read `hash % length` vs `(length - 1) & hash`, or any other similar combination. – Zabuzard Aug 28 '18 at 23:00
  • @Zabuza Right. The difference is lying over the `-1`. – Little Santi Aug 28 '18 at 23:04
  • Note that `%` is not always the same as `&`. In Javas code it works, obviously. But only due to how its done. See [Bitwise and in place of modulus operator](https://stackoverflow.com/questions/3072665/bitwise-and-in-place-of-modulus-operator). – Zabuzard Aug 28 '18 at 23:04
1

The difference between hash % n vs n % hash is that hash % n is going to be a lot more distributed than n % hash.

n % hash will almost always be equivalent to n, because a % b where b > a is equal to a (E.g. 15 % 30 = 15).

I created a graph to show the differences (red is x % n and blue is n % x where x represents the hash). Graph

The idea in java is to avoid the 'expensive' % (mod) operation and instead do the comparably cheap & (and) operation. But that only works when n is a power of 2. So a Java HashMap always uses a power of 2 for the number of buckets.