I would like to globally replace the common thread pool used by default by the Java parallel streams, i.e., for example for

IntStream.range(0,100).parallel().forEach(i -> {
    doWork();
});

I know that it is possible to use a dedicated ForkJoinPool by submitting such an instruction to a dedicated thread pool (see Custom thread pool in Java 8 parallel stream). The question here is:

  • Is it possible to replace the common ForkJoinPool by some other implementation (say an Executors.newFixedThreadPool(10))?
  • Is it possible to do so by some global setting, e.g., some JVM property?
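For context, the dedicated-pool workaround from the linked question can be sketched as follows (the class name and pool size of 10 are illustrative; the trick is that a parallel stream started from inside a ForkJoinPool worker thread runs in that pool rather than the common pool):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

public class CustomPoolDemo {

    // Runs the parallel stream inside a dedicated ForkJoinPool instead of
    // the common pool: stream tasks forked from a worker thread of a pool
    // are executed in that same pool.
    static long sumInDedicatedPool(int poolSize) throws Exception {
        ForkJoinPool pool = new ForkJoinPool(poolSize);
        try {
            return pool.submit(() ->
                IntStream.range(0, 100).parallel().asLongStream().sum()
            ).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumInDedicatedPool(10)); // sum of 0..99 = 4950
    }
}
```

This changes the pool per call site, not globally, which is exactly why the question asks for a global setting instead.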

Remark: The reason why I would like to replace the F/J pool is that it appears to have a bug which makes it unusable for nested parallel loops.

Nested parallel loops have poor performance and may lead to deadlocks, see http://christian-fries.de/blog/files/2014-nested-java-8-parallel-foreach.html

For example: The following code leads to a deadlock:

// Outer loop
IntStream.range(0,24).parallel().forEach(i -> {

    // (omitted:) do some heavy work here (consuming majority of time)

    // Need to synchronize for a small "subtask" (e.g. updating a result)
    synchronized(this) {
    // Inner loop (does something completely free of side effects, i.e. expected to work)
        IntStream.range(0,100).parallel().forEach(j -> {
            // do work here
        });
    }
});

(The deadlock occurs even without any additional code at "do work here", given that parallelism is set to < 12.)

My question is how to replace the FJP. If you would like to discuss nested parallel loops, you might check Nested Java 8 parallel forEach loop perform poor. Is this behavior expected?

Christian Fries
  • You keep posting identical questions both here and on the concurrency-interest list. Wait for your answer from Doug before posting here. – edharned May 09 '14 at 14:05
  • @edharned: On that list they always start academic discussions about compensation threads generated, etc. (which might explain a small performance hit). In my test I printed all the threads and I get a deadlock without any compensation thread. I just wanted a fix of a possibly serious bug. Since I didn't get any answer related to that bug, I was considering replacing the FJP as Holger did. So here I don't want to start this stuff again. I wanted to replace the FJP. - Maybe I should turn to some other framework? Akka Streams? Scala? Any suggestion? – Christian Fries May 09 '14 at 20:18
  • People here don't know what is being said there, so people here will post duplicate questions/answers. I agree the FJP is faulty. No suggestions today. Eventually Oracle will have to do something radical to replace it, but not soon I suspect. – edharned May 10 '14 at 22:18
  • Read the Java docs. It's not a parallel execution framework. That's why we have ExecutorService. What should it do anyway? Spawn 100 threads? parallel() might be a misleading name, but it's a fine framework with very great performance if you just read the manual and learn the semantics. parallel() does not speed up your code magically. Parallel in this context means that the task is split up recursively and processed on multiple cores. Yes, it increases stack usage, but I read an article about the JVM lately saying that this kind of problem is highly optimized. – Kr0e Jul 10 '14 at 14:26
  • Streams are not a drop-in replacement for for-loops. Java remains a highly imperative language. Streams are good at collection manipulation. You could write your own stream-impl based on an ExecutorService easily. – Kr0e Jul 10 '14 at 14:32
  • @Kr0e (I am a bit tired of repeating this ;-) , but...) This is not about the use of forEach being good style, this is about a serious bug with deadlocks or massive performance problems. This might also hit you if you use map-reduce (but the demo is more involved then). (The deadlock has nothing to do with the need to spawn 100 threads. Just play with the code in a debugger and check where it hangs. It is a deadlock situation due to a bug in the ForkJoinTask implementation.) – Christian Fries Jul 10 '14 at 14:47

2 Answers

I think that's not the way the stream API is intended to be used. It seems you're (mis)using it simply for parallel task execution (focusing on the task, not the data) instead of parallel stream processing (focusing on the data in the stream). Your code somewhat violates one of the main principles for streams (I write 'somewhat' as it is not strictly forbidden, but discouraged): avoid state and side effects.

Apart from that (or maybe because of the side effects), you're using heavy synchronization within your outer loop, which is anything but harmless!

Although not mentioned in the documentation, parallel streams use the common ForkJoinPool internally. Whether or not that is a documentation gap, we must simply accept this fact. The JavaDoc of ForkJoinTask states:

It is possible to define and use ForkJoinTasks that may block, but doing so requires three further considerations: (1) Completion of few if any other tasks should be dependent on a task that blocks on external synchronization or I/O. Event-style async tasks that are never joined (for example, those subclassing CountedCompleter) often fall into this category. (2) To minimize resource impact, tasks should be small; ideally performing only the (possibly) blocking action. (3) Unless the ForkJoinPool.ManagedBlocker API is used, or the number of possibly blocked tasks is known to be less than the pool's ForkJoinPool.getParallelism level, the pool cannot guarantee that enough threads will be available to ensure progress or good performance.

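A hedged sketch of the ForkJoinPool.ManagedBlocker API mentioned in that quote: wrapping a blocking wait in a ManagedBlocker lets the pool create a compensation thread instead of starving. The class name is illustrative, and the CountDownLatch is just a stand-in for any blocking action:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;

public class BlockerDemo {

    // Blocks on the latch via managedBlock, so a ForkJoinPool worker
    // calling this can be compensated for while it waits.
    static void awaitManaged(CountDownLatch latch) throws InterruptedException {
        ForkJoinPool.managedBlock(new ForkJoinPool.ManagedBlocker() {
            @Override public boolean block() throws InterruptedException {
                latch.await();
                return true; // no further blocking necessary
            }
            @Override public boolean isReleasable() {
                return latch.getCount() == 0; // true if already released
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        latch.countDown();
        awaitManaged(latch); // returns immediately, latch already open
        System.out.println("done");
    }
}
```
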
Again, it seems that you're using streams as replacement for a simple for-loop and an executor service.

  • If you just want to execute n tasks in parallel, use an ExecutorService
  • If you have a more complex example where tasks are creating subtasks, consider using a ForkJoinPool (with ForkJoinTasks) instead. (It ensures a constant number of threads without the danger of a deadlock because of too many tasks waiting for others to complete, as waiting tasks do not block their executing threads).
  • If you want to process data (in parallel), consider using the stream API.
  • You cannot 'install' a custom common pool. It's created internally in private static code.
  • But you can influence the parallelism, the thread factory, and the exception handler of the common pool using certain system properties (see the JavaDoc of ForkJoinPool)
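For example, the common pool's parallelism can be set via a system property. Note this only takes effect if set before the common pool is first touched, so it is normally passed on the command line; setting it in code as below is just a sketch:

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolConfig {
    public static void main(String[] args) {
        // Equivalent to the command-line flag:
        //   -Djava.util.concurrent.ForkJoinPool.common.parallelism=10
        // Only effective if set before the common pool is initialized.
        System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "10");
        System.out.println(ForkJoinPool.commonPool().getParallelism());
    }
}
```

The thread factory and exception handler are configured analogously via java.util.concurrent.ForkJoinPool.common.threadFactory and java.util.concurrent.ForkJoinPool.common.exceptionHandler. None of these let you swap in a different pool implementation, which is the point of the answer above.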

Don't mix up ExecutorService and ForkJoinPool. They are (usually) not replacements for each other!

isnot2bad
  • Thank you and +1 for giving a partial (albeit negative) answer. With respect to your other remarks: The reason why I would like to replace the FJP is that it has a bug. This bug is also responsible for the very poor performance of nested loops. In my application that bit of synchronization is harmless (around a small block of code), but I do not like to redesign my whole framework just because of a bug that should get fixed (hopefully soon). See http://christian-fries.de/blog/files/2014-nested-java-8-parallel-foreach.html - I have updated my question to make that a bit clearer. – Christian Fries May 14 '14 at 19:30
  • I've read through your other SO questions and forum threads related to this bug. And I already did some testing with your code and modifications of it. I'm not fully convinced yet that this is a bug (the code of ForkJoinTask/-Pool is complex and concurrent applications are usually hard to debug). I'll do some more testing as soon as I have time to spare. – isnot2bad May 15 '14 at 07:08
  • However, still I think that synchronized should not be used here. And I think it does not make sense to parallelize things within a synchronized block that acts as a barrier for the same ForkJoinPool-worker-threads that are used for the wanted parallelization. So the inner loop always runs single threaded in your case. Maybe you could use a Phaser instead - it is thought to work cooperatively with ForkJoinPools. – isnot2bad May 15 '14 at 07:13
  • The only reasonable place for a synchronized block is INSIDE something that is expected to be concurrent. Sure, the synchronized part should be small compared to the remainder (which I just omitted). Your conclusion is wrong: ForkJoinPool will parallelize the inner loop too by creating compensation threads. I know it is not guaranteed to have enough compensation threads, but the example above actually works very nicely - if the bug in the FJP implementation is fixed! And even if not: just consider the case where you have 64 cores/parallelism, the outer loop having 8 (big) tasks and the inner loop 100... – Christian Fries May 19 '14 at 15:38
Although your original question is already answered well by isnot2bad, it might be important for you that the described bug (the reason for your wish to exchange the FJP implementation) seems to be fixed with Java 1.8.0_40. See Nested Java 8 parallel forEach loop perform poor. Is this behavior expected?

Till Schäfer
  • This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post - you can always comment on your own posts, and once you have sufficient [reputation](http://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](http://stackoverflow.com/help/privileges/comment). – recursion.ninja Jul 08 '15 at 18:34
  • @recursion.ninja : this answer is newer, it is not as detailed as the first one. But it answers the question, since it advises to upgrade to java 1.8.0.40. It seems a valuable piece of information to me. – francis Jul 08 '15 at 18:52
  • @francis The answer should be expanded then, rather than linking to another question. Consider including relevant excerpts from the other question into this answer (linking for attribution) and explaining why the upgrade will resolve the asker's specific problem. – recursion.ninja Jul 08 '15 at 19:02