I recently raised a question in stackoverflow, then found the answer. The initial question was What mechanisms other than mutexs or garbage collection can slow my multi-threaded java program?
I discovered to my horror that HashMap has been modifed between JDK1.6 and JDK1.7. It now has a block of code that causes all threads creating HashMaps to synchronize.
The line of code in JDK1.7.0_10 is
/**A randomizing value associated with this instance that is applied to hash code of keys to make hash collisions harder to find. */
transient final int hashSeed = sun.misc.Hashing.randomHashSeed(this);
Which ends up calling
protected int next(int bits) {
long oldseed, nextseed;
AtomicLong seed = this.seed;
do {
oldseed = seed.get();
nextseed = (oldseed * multiplier + addend) & mask;
} while (!seed.compareAndSet(oldseed, nextseed));
return (int)(nextseed >>> (48 - bits));
}
Looking in other JDKs, I find this isn't present in JDK1.5.0_22, or JDK1.6.0_26.
The impact on my code is huge. It makes it so that when I run on 64 threads, I get less performance than when I run on 1 thread. A JStack shows that most of the threads are spending most of their time spinning in that loop in Random.
So I seem to have some options:
- Rewrite my code so that I don't use HashMap, but use something similar
- Somehow mess around with the rt.jar, and replace the hashmap inside it
- Mess with the class path somehow, so each thread gets its own version of HashMap
Before I start down any of these paths (all look very time consuming and potentially high impact), I wondered if I have missed an obvious trick. Can any of you stack overflow people suggest which is the better path, or perhaps identify a new idea.
Thanks for the help