0

Performance difference on 2 different streams executions

I try to do the same operation for default parallel stream and using custom ForkJoin pool.

I see huge performance difference for the same operation. 94 ms vs ~5341 ms (Time1 and Time2 are almost the same - so I don't blame awaitQuiescence here)

What can be a reason ? Tricky java intrinsic ?

public final class SharedForkJoinExecutor {
        private static final Logger LOGGER = LoggerFactory.getLogger(SharedForkJoinExecutor.class);


    private static final ForkJoinPool EXEC = new ForkJoinPool(ForkJoinPool.commonPool().getParallelism(),
                                                              pool -> {
                                                                  final ForkJoinWorkerThread aThread = ForkJoinPool.defaultForkJoinWorkerThreadFactory.newThread(pool);
                                                                  aThread.setName("ForkJoin-Executor-" + aThread.getPoolIndex());
                                                                  return aThread;
                                                              },
                                                              (t, e) -> LOGGER.info(e.getMessage(), e),
                                                              true);

    /**
     * Shuts down this executor
     */
    public static void shutdown() {
        EXEC.shutdown();
    }

    public static ForkJoinPool get() {
        return EXEC;
    }
}
package com.stream;

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

import com.stream.SharedForkJoinExecutor;

import org.junit.Test;

import static junit.framework.TestCase.assertEquals;

public class ForkJoinTest {

    private static final int INT_NUMBERS = 1_000_000;

    @Test
    public void forEachIntTest() {
        final AtomicInteger aEvenCounter = new AtomicInteger(0);
        final AtomicInteger aAllCounter = new AtomicInteger(0);

        long t = System.currentTimeMillis();
        IntStream.range(0, INT_NUMBERS).parallel().forEach(theIndex -> {
            if (theIndex % 2 == 0) {
                aEvenCounter.incrementAndGet();
            }
            aAllCounter.incrementAndGet();
        });

        System.out.println("Time=" + (System.currentTimeMillis() - t));
        assertEquals(INT_NUMBERS / 2, aEvenCounter.get());
        assertEquals(INT_NUMBERS, aAllCounter.get());

        aEvenCounter.set(0);
        aAllCounter.set(0);

        t = System.currentTimeMillis();
        SharedForkJoinExecutor.get().execute(() -> IntStream.range(0, INT_NUMBERS).parallel().forEach(theIndex -> {
            if (theIndex % 2 == 0) {
                aEvenCounter.incrementAndGet();
            }

            aAllCounter.incrementAndGet();
        }));

        System.out.println("Time1=" + (System.currentTimeMillis() - t));


        SharedForkJoinExecutor.get().awaitQuiescence(10, TimeUnit.HOURS);

        System.out.println("Time2=" + (System.currentTimeMillis() - t));
        assertEquals(INT_NUMBERS / 2, aEvenCounter.get());
        assertEquals(INT_NUMBERS, aAllCounter.get());
    }
}

Stefan Zobel
  • 3,182
  • 7
  • 28
  • 38
Alexander
  • 11
  • 6
  • I ran your test (Java 8 on linux) and it printed: Time=40 Time1=1 Time2=38. What huge performance difference are you talking about? – Alexei Kaigorodov Aug 11 '19 at 05:13
  • I also counted where the tasks were executed at the second run. 56% were executed on the ForkJoinPool.commonPool(), 31% on your SharedForkJoinExecutor and 12.5% on the main thread via call to `awaitQuiescence()`. – Alexei Kaigorodov Aug 11 '19 at 05:19
  • permanent access to common `AtomicInteger`s creates huge contention, so your test cannot measure parallelism accurately. – Alexei Kaigorodov Aug 11 '19 at 05:38
  • I run this test on the macOs via IDE – Alexander Aug 11 '19 at 09:27
  • JUnit is not a benchmark tool. You should read [How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/q/504103/2711488). Running two operations that way one after the other is not a valid comparison of the two. Further, `System.currentTimeMillis()` is not a reliable measuring method for elapsed time. – Holger Aug 12 '19 at 16:08
  • Yes, I know, thanks. I am aware about JMH. But this is still a question what happens in the middle of the test during that 5 seconds – Alexander Aug 12 '19 at 16:15
  • It might be ForkJoin pool's initialization or something else, the question is mostly about it. Streams and contention are not so important here, the logic of the test is trivial. – Alexander Aug 12 '19 at 16:18
  • Of course, you are measuring the initialization of the fork/join pool, as you’re initializing it within the measured execution. In this specific case, the initialization of the `SharedForkJoinExecutor` class will happen on the first `SharedForkJoinExecutor.get()` invocation, which is right after the `t = System.currentTimeMillis();` statement. However, I don’t know what your system is doing in these five seconds (did I mention that `System.currentTimeMillis()` is not a reliable measuring method for elapsed time?). I can’t reproduce your times, not even remotely. – Holger Aug 26 '19 at 08:44

0 Answers0