4

My Java program uses java.util.concurrent.Executor to run multiple threads. Each thread starts a Runnable that reads a comma-delimited text file from the C: drive, loops through the lines, and splits and parses the text into floats. The data is then stored in:

static Vector
static ConcurrentSkipListMap

My PC runs Windows 7 64-bit on an Intel Core i7 with six * 2 cores and 24 GB of RAM. The program runs for 2 minutes and finishes all 1700 files, but CPU usage stays around 10% to 15%, no matter how many threads I assign using:

Executor executor=Executors.newFixedThreadPool(50);

Executors.newFixedThreadPool(500) gives neither better CPU usage nor a shorter time to finish the tasks. There is no network traffic; everything is on the local C: drive. There is enough RAM for more threads, but I get an "OutOfMemoryError" when I increase the thread count to 1000.

Why don't more threads translate to more CPU usage and less processing time?

Edit: My hard drive is a 200 GB SSD.

Edit: I finally found the problem. Each thread writes its results to a log file that is shared by all threads. The more times I ran the app, the larger the log file grew and the slower the run became, and because the file is shared, every thread was slowed down by it. After I stopped writing to the log file, all tasks finish in 10 seconds!

Frank
  • More threads won't make your disk faster. – user2357112 Jul 24 '13 at 03:15
  • Google for some graphical CPU time analyser, and check how much time the CPU spends in the "IO wait" state. This is affected by latency of the disk (and other resources). FYI, a good SSD disk reduces the IO wait to almost zero. – Ondra Žižka Jul 24 '13 at 03:20
  • Do your threads have any synchronization? – jpmc26 Jul 24 '13 at 03:37
  • Sync is handled by Executor, isn't it? I don't have anything else going on. – Frank Jul 24 '13 at 03:49
  • You would probably only have synchronization if your threads access the same or some of the same objects. This might come in the form of using an object with internal synchronization (such as `Vector` or `ConcurrentSkipListMap`, **which you're using**) or via `synchronized`. Do multiple threads access the same `Vector` or `ConcurrentSkipListMap`? – jpmc26 Jul 24 '13 at 03:57
  • Thanks for pointing that out! Yes, the result data objects are saved into the Vector and ConcurrentSkipListMap. Could that also be a bottleneck? – Frank Jul 24 '13 at 04:04
  • Unless each thread has its own `Vector` and `ConcurrentSkipListMap`, **yes**. It most certainly can. See my answer for an edit. – jpmc26 Jul 24 '13 at 04:13

3 Answers

4

The OutOfMemoryError is probably coming from Java's own limits on its memory usage. Try using some of the arguments here to increase the maximum memory.
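As a rough sketch of what raising the limit looks like, assuming the error really is a heap-size limit (the question does not confirm which kind of OutOfMemoryError it is): the standard `-Xmx` JVM option sets the maximum heap, and the small class below prints the limit the JVM is actually running with. The class name `HeapLimit` and the `4g` value are illustrative only.

```java
class HeapLimit {
    // Maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run as, for example:  java -Xmx4g HeapLimit
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```

Comparing the printed value with and without `-Xmx` shows whether the option took effect.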

For speed, Adam Bliss starts with a good suggestion. If this is the same file over and over, then I imagine having multiple threads try to read it at the same time could result in a lot of contention over locks on the file. More threads would even mean more contention, which could even result in worse overall performance. So avoid that and simply load the file once if it's possible. Even if it's a large file, you have 24 GB of RAM. You can hold quite a large file, but you may need to increase the JVM's allowed memory to allow the whole file to be loaded.

If there are multiple files being used, then consider this fact: your disk can only read one file at a time. So having multiple threads trying to use the disk all at the same time probably won't be too effective if the threads aren't spending much time processing. Since you have so little CPU usage, it could be that the thread loads part of the file, then runs very quickly on the part that got buffered, and then spends a lot of time waiting for the rest of the file to load. If you're loading the file over and over, that could even still apply.

In short: Disk IO probably is your culprit. You need to work to reduce it so that the threads aren't contending for file content so much.
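One way to check this guess before changing anything is to time the read and the parse separately for a single file. The following is only a sketch: the class `IoVsParseTimer`, its method names, and the temp-file demo are made up for illustration, and it assumes each line holds comma-separated floats as described in the question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class IoVsParseTimer {
    // Returns {nanoseconds reading, nanoseconds parsing, number of floats parsed}.
    static long[] timeFile(Path file) {
        try {
            long t0 = System.nanoTime();
            byte[] raw = Files.readAllBytes(file);          // disk I/O only
            long t1 = System.nanoTime();
            long count = 0;
            String text = new String(raw, StandardCharsets.UTF_8);
            for (String line : text.split("\\r?\\n")) {     // CPU work only
                if (line.trim().isEmpty()) continue;
                for (String field : line.split(",")) {
                    Float.parseFloat(field.trim());
                    count++;
                }
            }
            long t2 = System.nanoTime();
            return new long[] { t1 - t0, t2 - t1, count };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Self-contained demo on a small temporary file (stand-in for a real data file).
    static long[] demo() {
        try {
            Path tmp = Files.createTempFile("sample", ".csv");
            Files.write(tmp, "1.0,2.0\n3.0,4.0\n".getBytes(StandardCharsets.UTF_8));
            long[] result = timeFile(tmp);
            Files.delete(tmp);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println("read ns=" + r[0] + ", parse ns=" + r[1] + ", values=" + r[2]);
    }
}
```

If the read time dominates on the real files, the disk is the likely limit; if neither number is large, the time is probably going somewhere else, such as lock contention or other shared output.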

Edit:

After further consideration, it's more likely a synchronization issue. Threads are probably getting held up trying to add to the result list. If access is frequent, this will result in huge amounts of contention for locks on the object. Consider having each thread save its results in a local list (like ArrayList, which is not thread safe), and then copying all values into the final, shared list in chunks to reduce contention.
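A minimal sketch of that idea, assuming the shared collection is a static `Vector<Float>` like the one in the question; the class `LocalBatchDemo`, the `parseLine` helper, and the sample input are hypothetical. Each task fills a private `ArrayList` without locking, then takes the `Vector`'s lock once via `addAll` instead of once per value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class LocalBatchDemo {
    // Shared result list, standing in for the static Vector from the question.
    static final Vector<Float> RESULTS = new Vector<>();

    // Hypothetical per-task work: parse one comma-delimited line into floats.
    static List<Float> parseLine(String line) {
        List<Float> local = new ArrayList<>();   // thread-local, no locking needed
        for (String field : line.split(",")) {
            local.add(Float.parseFloat(field.trim()));
        }
        return local;
    }

    // Runs `tasks` tasks over the same sample line and returns the shared list size.
    static int run(int tasks, String line) {
        RESULTS.clear();
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                List<Float> local = parseLine(line);
                RESULTS.addAll(local);           // one synchronized call per task
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return RESULTS.size();
    }

    public static void main(String[] args) {
        System.out.println(run(100, "1.5, 2.5, 3.5"));
    }
}
```

The pool here is sized to the number of available processors rather than 50 or 500; that is a design choice for the sketch, on the assumption that parsing is CPU-bound once contention is removed.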

jpmc26
  • Each thread reads only one file, no two threads read the same file, and my app reads each file only once. Although I have an SSD, you have a good point: it may not be able to handle multiple file reads at the same time, and that might be the bottleneck. Thanks! – Frank Jul 24 '13 at 03:47
  • A disk does not read files at all, it reads disk blocks, one at a time. Which means that many files can indeed be read concurrently. – Ingo Jul 24 '13 at 14:29
  • @Ingo Technically, sure. But that's minutiae that doesn't change the point, especially with a large set of files. The files are not likely to be crowded into a small disk space, and we certainly can't assume that's the case. In the end, the disk can only read one set of data at a time, which means that any thread looking for something outside that block is held up by the I/O. – jpmc26 Jul 24 '13 at 17:43
1

You're probably being limited by IO, not CPU.

Can you reduce the number of times you open the file to read it? Maybe open it once, read all the lines, keep them in memory, and then iterate on that.

Otherwise, you'll have to look at getting a faster hard drive. SSDs can be quite speedy.
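The "open it once, read all the lines, keep them in memory" suggestion could look roughly like this. It is a sketch only: `ReadOnceDemo` and its method names are invented for illustration, and it assumes UTF-8 files whose lines contain nothing but comma-separated floats.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class ReadOnceDemo {
    // Parses already-loaded lines; no disk access happens here.
    static List<float[]> parseLines(List<String> lines) {
        List<float[]> rows = new ArrayList<>(lines.size());
        for (String line : lines) {
            if (line.trim().isEmpty()) continue;   // skip blank lines
            String[] fields = line.split(",");
            float[] row = new float[fields.length];
            for (int i = 0; i < fields.length; i++) {
                row[i] = Float.parseFloat(fields[i].trim());
            }
            rows.add(row);
        }
        return rows;
    }

    // Reads the whole file in a single call, then parses from memory.
    static List<float[]> loadAndParse(Path file) {
        try {
            return parseLines(Files.readAllLines(file, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Separating the read from the parse this way also makes it easy to see which of the two steps is actually slow.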

Adam Bliss
1

Is it possible that your threads are somehow given low priority on the system? In that case, increasing the number of threads wouldn't correspond to an increase in CPU usage, since the amount of CPU time allotted to your program may be throttled somewhere else.

Are there any configuration files/ initialization steps where something like this could possibly occur?

krishnakid