
So, I have a loop where I create thousands of threads which process my data.

I checked and storing a Thread slows down my app.

This is from my loop:

Record r = new Record(id, data, outPath, debug);
//r.start();
threads.add(r);

//id is 4 digits
//data is something like 500 chars long

It stalls my for loop for a while (one run takes a second or more, which is too much!).

Only init > duration: 0:00:06.369

With adding thread to ArrayList > duration: 0:00:07.348


Questions:

  • what is the best way of storing Threads?
  • how to make Threads faster?
  • should I create Threads and run them with a special executor, for example 10 at once, then the next 10, etc.? (if yes, then how?)
Shaq
  • What is the threads reference here? – M Sach Jan 05 '16 at 14:53
  • Given [this](http://stackoverflow.com/questions/763579/how-many-threads-can-a-java-vm-support) question, it could be that your machine cannot handle that many threads and becomes unstable. – SomeJavaGuy Jan 05 '16 at 14:55
  • Use an Executor if you aren't already. And I believe it's better to store the Futures instead of the thread references. I don't know if this will affect performance, though. – Reinard Jan 05 '16 at 14:57

3 Answers


Consider that having a very high number of threads is not very useful.

At most, the number of threads that can execute at the same time equals the number of cores of your CPU.

The best is to reuse existing threads. To do that you can use the Executor framework.

For example, to create an Executor that internally handles at most 10 threads, you can do the following:

List<Record> records = ...;

ExecutorService executor = Executors.newFixedThreadPool(10);

for (Record r : records) {
   executor.submit(r);
}

// At the end stop the executor
executor.shutdown();

With code similar to this you can submit many thousands of commands (Runnable implementations), but no more than 10 threads will be created.
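To make the snippet above concrete, here is a minimal, self-contained sketch of the same idea. The class name FixedPoolSketch, the method runTasks, and the counter that stands in for the real Record work are all made up for illustration; they are not from the question. It also waits for the submitted tasks to finish after shutdown(), which the snippet above leaves out.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FixedPoolSketch {
    // Runs taskCount trivial tasks on a pool of poolSize threads and
    // returns how many of them completed.
    static int runTasks(int taskCount, int poolSize) {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < taskCount; i++) {
            // Hypothetical stand-in for the per-record work in the question.
            executor.submit(() -> { completed.incrementAndGet(); });
        }

        // No new tasks are accepted; already submitted tasks still run.
        executor.shutdown();
        try {
            // Block until the queue is drained (or give up after a minute).
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runTasks(5000, 10));
    }
}
```

The loop only enqueues work, so it should return quickly even for thousands of tasks; the pool threads pick the tasks up as they become free.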

Davide Lorenzo MARINO
  • So I did that and here are the results, @davide-lorenzo-marino (thread count | time). For the 1.5 MB file: 1 | 1,42; 2 | 1,20; 4 | 1,13; 5 | 1,09; 6 | 1,09; 8 | 1,13; 10 | 1,07. For the 4 MB file: 2 | 6,16; 5 | 5,50. The second file is 3 times larger but 6 times slower. – Shaq Jan 08 '16 at 13:39

I'm guessing that it is not the .add method that is really slowing you down. My guess is that the hundreds of Threads running in parallel are the real problem. With that many threads competing for the CPU, even a simple call like "add" can wait a long time before it gets to run, even if the execution itself is fast. It is also possible that your data structure has an add method that is O(n).

Possible solutions for this:

  • Find a real wait-free solution for this, e.g. prioritising threads.
  • Add them all to your data structure before executing them.

While it is possible to work like this, it is strongly discouraged to create more than a few Threads for tasks like this. You should use an Executor, as Davide Lorenzo MARINO already pointed out.

Geki

I have a loop where I create thousands of threads...

That's a bad sign right there. Creating threads is expensive.

Presumably your program creates thousands of threads because it has thousands of tasks to perform. The trick is to decouple the threads from the tasks. Create just a few threads, and reuse them.

That's what a thread pool does for you.

Learn about the java.util.concurrent.ThreadPoolExecutor class and related classes (e.g., Future). It implements a thread pool, and chances are very likely that it provides all of the features that you need.
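As a rough sketch of how a pool and Future fit together: each task is submitted as a Callable, the returned Future is stored instead of a Thread reference, and the result is collected later with get(). The class FutureSketch and the "length of the data" task are hypothetical stand-ins, not anything from the question. The pool here comes from Executors.newFixedThreadPool, which in the standard JDK is backed by a ThreadPoolExecutor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureSketch {
    // Computes the length of every string on a small pool and returns
    // the results in the same order as the input.
    static List<Integer> lengths(List<String> data, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            // Store Futures, not Threads: one handle per submitted task.
            List<Future<Integer>> futures = new ArrayList<>();
            for (String s : data) {
                Callable<Integer> task = s::length; // hypothetical work
                futures.add(pool.submit(task));
            }

            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // blocks until that task is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            // The task itself threw; surface the original cause.
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as FutureSketch.lengths(Arrays.asList("a", "bb", "ccc"), 2) would run the three tasks on two threads and hand back the lengths in input order.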

If your needs are simple enough, you can use one of the static methods in java.util.concurrent.Executors to create and configure a thread pool (e.g., Executors.newFixedThreadPool(N) will create a new thread pool that runs at most N threads at a time).

If your tasks are all compute bound, then there's no reason to have any more threads than the number of CPUs in the machine. If your tasks spend time waiting for something (e.g., waiting for commands from a network client), then the decision of how many threads to create becomes more complicated: It depends on how much of what resources those threads use. You may need to experiment to find the right number.
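For the compute-bound case, one common starting point is to size the pool from Runtime.getRuntime().availableProcessors(). The sketch below assumes that approach; the class PoolSizeSketch and the sum-of-squares workload are invented purely for illustration, and for tasks that block on I/O the right number would still have to be found by experiment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolSizeSketch {
    // One thread per available core, and never fewer than one.
    static int cpuBoundPoolSize() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    // Hypothetical compute-bound workload: sums i*i for i = 1..n,
    // with each square computed as a separate task.
    static long sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(cpuBoundPoolSize());
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long v = i;
                tasks.add(() -> v * v);
            }
            long total = 0;
            // invokeAll submits every task and waits for all of them.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The number of tasks (n) and the number of threads are independent here, which is the decoupling described above.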

Solomon Slow