
I have a class Car:

 public class Car {

     final int uniqueId;

     Car(int id) {
         uniqueId = id;
     }
 }

I have to store this as a key in a HashMap. There will be at most 1000 cars. What would be an efficient hashCode implementation for this, so that the right number of buckets is formed?

The IDE generated:

 @Override
 public int hashCode() {
     final int prime = 31;
     int result = 1;
     result = prime * result + uniqueId;
     return result;
 }

But it will generate a unique hash code for each value, so it does not seem right.
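For reference, a minimal sketch (not from the original post) of the direction the discussion takes: key the map on the immutable `uniqueId` and keep `equals()` and `hashCode()` consistent with each other. The `equals()` body here is an assumption, since the post does not show one.

```java
// Sketch only; assumes equality is defined solely by uniqueId.
class Car {
    final int uniqueId;

    Car(int id) {
        uniqueId = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Car)) return false;
        return uniqueId == ((Car) o).uniqueId;
    }

    @Override
    public int hashCode() {
        return uniqueId; // same value Integer.hashCode(uniqueId) would return
    }
}
```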

user3396729
    The first consideration for hashCode *must* be that it fulfills its contract. Efficiency of separation of codes is important but is secondary in importance. And so knowing this, one can't answer this question without knowing Car's fields, without knowing which fields are immutable, without seeing its `equals(...)` method. – Hovercraft Full Of Eels Aug 16 '15 at 14:42
  • I stick to Eclipse's automatic hash code generator. – Mordechai Aug 16 '15 at 14:45
  • The uniqueId field is immutable. I'll mark the field as final. Equals can be decided based on the uniqueId. – user3396729 Aug 16 '15 at 14:46
  • @MouseEvent updated my question – user3396729 Aug 16 '15 at 14:53
  • 1
    *it will generate a unique hashcode for each value. So it does not seems right*? Why doesn't it seem right? That's the best you can have: a unique value for each instance. It could be reduced to `return uniqueId;`, though. – JB Nizet Aug 16 '15 at 14:55
  • The number of buckets in my HashMap: for 1000 unique IDs it will store them all in different buckets. – user3396729 Aug 16 '15 at 14:56
  • 1
    It is worth noting that you code is the same as `31 * 1 + id` with is the same say just `id`. Also having a unique hashCode doesn't prevent collisions as your HashMap is limited in size. e.g. by default it is 16, so there is 1/16 chance that 2 entries will collide. – Peter Lawrey Aug 16 '15 at 14:56
  • The size of the underlying bucket array would be too large – user3396729 Aug 16 '15 at 14:56
  • 4
    No, that's not how a HashMap works. The HashMap decides the number of buckets it uses. Then it uses the hashCode to decide which bucket to use for an object. It's a bit more complex than that, but you can consider it does hashCode % numberOfBuckets to find the index of the bucket to use for a given hashCode. When the number of objects become too high for the number of buckets, it creates more buckets and rearranges all the map, to limit the number of elements per bucket. – JB Nizet Aug 16 '15 at 14:57
  • If you have 1000 unique ids between 0 and 999 and a HashMap of 2048, they will all be in their own bucket. If they are just unique, it is highly unlikely there won't be collisions. – Peter Lawrey Aug 16 '15 at 14:58
  • @JBNizet I see. So that was what I was thinking wrong. Is there an explanation somewhere about how the map decides the number of buckets? It would really help. Thanks – user3396729 Aug 16 '15 at 15:00
  • In the javadoc of HashMap, of course. – JB Nizet Aug 16 '15 at 15:04
  • I would start with [the documentation](https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html); it gives a pretty thorough explanation of how the initial number of buckets is decided (_initial capacity_) and when the implementation increases this number (_load factor_). – Boris the Spider Aug 16 '15 at 15:04
  • So in my case simply returning the uniqueId seems OK? Right. Why take a prime number and multiply? – user3396729 Aug 16 '15 at 15:04
  • @user3396729 because if your IDs start from `0` and are numbered sequentially, you will not have a decent distribution of IDs. You ideally want the hashCode to produce a uniform distribution of values scattered around the buckets evenly. – Boris the Spider Aug 16 '15 at 15:06
  • Yes it's OK. That's what the hashCode() method of java.lang.Integer does. Eclipse multiplies by a prime number because it generated code that is usually used with several fields, of several different types. – JB Nizet Aug 16 '15 at 15:06
  • @BoristheSpider It won't be 0; the ID generator ensures that in my case. Just thinking about the decent-distribution thing you said. – user3396729 Aug 16 '15 at 15:10
  • @JBNizet Just thinking about what Boris said on the distribution thing – user3396729 Aug 16 '15 at 15:11
  • I don't see why a sequential distribution of IDs wouldn't be good. It's the best you can get. – JB Nizet Aug 16 '15 at 15:16
  • That's what I was thinking. Why incur the overhead of multiplying? – user3396729 Aug 16 '15 at 15:18
  • ok, Thanks guys for the help. Really appreciate the time you took for helping out. – user3396729 Aug 16 '15 at 15:34
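To illustrate the point made in the comments that the map, not the hashCode, decides the bucket count: below is a sketch of how an OpenJDK 8-style HashMap picks a bucket (simplified; the real implementation also handles null keys and treeified bins). The class and method names are mine.

```java
// Simplified sketch of Java 8 HashMap bucket selection.
// The table length n is always a power of two, so (n - 1) works as a bitmask.
class BucketDemo {
    // HashMap.hash(): XOR the high 16 bits into the low 16 bits so that
    // small tables still feel the influence of the high bits.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    static int bucketIndex(int hashCode, int n) {
        return spread(hashCode) & (n - 1); // equivalent to mod n for power-of-two n
    }

    public static void main(String[] args) {
        // Sequential ids 0..999 in a 2048-slot table each get their own
        // bucket; in the default 16-slot table they must collide.
        System.out.println(bucketIndex(1000, 2048)); // 1000
        System.out.println(bucketIndex(1000, 16));   // 8
        System.out.println(bucketIndex(16, 16));     // 0 -- collides with id 0
    }
}
```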
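And a sketch of pre-sizing the map for the stated maximum of 1000 cars, following the initial-capacity and load-factor rules from the HashMap javadoc linked in the comments. The capacity arithmetic and helper method are mine, not from the post.

```java
import java.util.HashMap;
import java.util.Map;

class PresizeDemo {
    // With the default load factor of 0.75, resizing is avoided when the
    // requested capacity exceeds expected / 0.75. HashMap rounds the request
    // up to the next power of two internally (here 1334 -> 2048).
    static Map<Integer, String> buildMap(int expected) {
        int capacity = (int) (expected / 0.75f) + 1;
        Map<Integer, String> cars = new HashMap<>(capacity);
        for (int id = 1; id <= expected; id++) {
            cars.put(id, "car-" + id);
        }
        return cars;
    }

    public static void main(String[] args) {
        System.out.println(buildMap(1000).size()); // 1000
    }
}
```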

0 Answers