I have the code which reads a set of binary files which essentially consist from a lot of serialized java objects. I'm trying to parallelize the code, by running the reading of the files in the thread pool ( Executors.newFixedThreadPool
)
What I'm seeing is that when threaded, the reading runs actually slower than in a single thread -- from 1.5 to 10 time slower,depending on the number of threads.
In my test-case I'm actually reading the same file (35mb) from multiple threads, so I'm not bound by I/O in any way. I do not run more threads than CPUs and I do not have any synchronization between pools -- i.e. I'm just processing independently a bunch of files.
Does anyone have an idea what could be a possible reason for this slow performance when threaded ? What should I look for ? Or what's the best way to dissect the problem? I already looked for static variables in the classes, which could be shared between threads and I don't see any.
Can one of the java.*
classes when instantiated in the thread run significantly slower, (e.g. java.zip.deflate
which I'm using)?
Thanks for any hints.
Upd: Another interesting hint is that when a single thread is running the execution time of the function which does the reading is constant to high precision, but when running multiple threads, I see significant variation in timings.