2

I want to test how much faster Java8 parallel stream so I wrote a program .The program count the number of prime numbers in a given list of numbers. The program counts prime numbers in there ways:

  1. using for loop;
  2. by using lambda expression;
  3. by using lambda expression(parallel stream).

Before executing the program I was expecting that parallel stream version should be faster. But the result is

Total prime no found 664579 in 4237 mili sec ----for loop version
Total prime no found 664579 in 2440 mili sec ----parallel stream
Total prime no found 664579 in 2166 mili sec ----lambda expression

My doubt is why parallel version is slower then lambda version

List<Integer> numbers = new ArrayList<>();
        for (int i = 0; i < 10000000; i++) {
            numbers.add(i);
        }
        Stopwatch stopwatch = Stopwatch.createStarted();
        int counter = 0;
        for (int number : numbers) {
            if (isPrime(number)) {
                counter++;
            }
        }
        stopwatch.stop();
        System.out.println("Total prime no found " + counter + " in " + stopwatch.elapsed(TimeUnit.MILLISECONDS) + " mili sec");

        stopwatch = Stopwatch.createStarted();
        long count1 = numbers.parallelStream().filter(n -> isPrime(n)).count();
        stopwatch.stop();
        System.out.println("Total prime no found " + count1 + " in " + stopwatch.elapsed(TimeUnit.MILLISECONDS) + " mili sec");

        stopwatch = Stopwatch.createStarted();
        long count2 = numbers.stream().filter(n -> isPrime(n)).count();
        System.out.println("Total prime no found " + count2 + " in " + stopwatch.elapsed(TimeUnit.MILLISECONDS) + " mili sec");
        stopwatch.stop();

The above program uses google Guava library to calculate time elapsed.

Marko Topolnik
  • 195,646
  • 29
  • 319
  • 436
Ashok_Pradhan
  • 1,159
  • 11
  • 13
  • 4
    while doing benchmarking you should also take care of warmup period – jmj Mar 28 '14 at 03:53
  • 3
    Also: `count1` and `count2` are identical... – Brian S Mar 28 '14 at 03:54
  • I changed it but the results a inconsistent – Ashok_Pradhan Mar 28 '14 at 06:27
  • Total prime no found 664579 in 4178 mili sec --loop
    Total prime no found 664579 in 2597 mili sec --parallel stream
    Total prime no found 664579 in 4187 mili sec --lambda expression
    Total prime no found 664579 in 4198 mili sec --loop
    Total prime no found 664579 in 4690 mili sec --in parallel stream
    Total prime no found 664579 in 4195 mili sec --in lambda expression
    Total prime no found 664579 in 4252 mili sec --loop
    Total prime no found 664579 in 2455 mili sec --parallel
    Total prime no found 664579 in 4555 mili sec --lambda expression
    – Ashok_Pradhan Mar 28 '14 at 06:28
  • Your benchmarking methodology is a bit simplistic. You should use a framework such as jmh. – assylias Mar 28 '14 at 08:58
  • Include the code of `isPrime` so people can corroborate your testimony. As it is now, we can only take your word for it. – Marko Topolnik Mar 28 '14 at 09:33
  • @MarkoTopolnik, At the time I wrote my comment, both lines were identical save for the variable name. – Brian S Mar 28 '14 at 10:17
  • See here: [How do I write a correct micro-benchmark in Java?](http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) – Balder Mar 29 '14 at 10:01

2 Answers2

4

Most likely, the problem is that during your test the JIT compiler (re-)compiles code. That means that your comparison isn't fair, because the later tests benefit from compilations caused by the earlier tests.

This is a frequently made mistake of micro benchmarks. There are a lot of articles explaining the problem, e.g. Robust Java benchmarking. If I add some code to warmup the JIT first, the results are expected. My main method looks as follows:

public static void main(String... args) {
    System.out.println("Warmup...");
    for (int i = 0; i < 5000; ++i) {
        run(Demo::testLoop, 5000);
        run(Demo::testStream, 5000);
        run(Demo::testParallel, 5000);
    }
    System.out.println("Benchmark...");
    int bound = 10000000;
    System.out.printf("Loop:     %s\n", run(Demo::testLoop, bound));
    System.out.printf("Stream:   %s\n", run(Demo::testStream, bound));
    System.out.printf("Parallel: %s\n", run(Demo::testParallel, bound));
}

And the output looks as follows:

Loop:     7.06s (664579)
Stream:   7.06s (664579)
Parallel: 3.84s (664579)

If you pass the option -XX:+PrintCompilation to the Java VM, you can see when and where the JIT kicks in, and that almost all compilations now happen during the warmup phase.

Note that parallel streams are not the best solution for this kind of parallelization, because the complexity of isPrime depends on the value. That means, the first half of the sequence requires significantly less work than the second half (and so on).

For reference, here are the remaining methods of my implementation:

public static boolean isPrime(int value) {
    for (int i = 2; i * i <= value; ++i)
        if (value % i == 0) return false;
    return true;
}

public static long testLoop(int bound) {
    long count = 0;
    for (int i = 2; i < bound; ++i)
        if (isPrime(i)) ++count;
    return count;
}

public static long testStream(int bound) {
    return IntStream.range(2, bound).filter(Demo::isPrime).count();
}

public static long testParallel(int bound) {
    return IntStream.range(2, bound).parallel().filter(Demo::isPrime).count();
}

public static String run(IntToLongFunction operation, int bound) {
    long start = System.currentTimeMillis();
    long count = operation.applyAsLong(bound);
    long millis = System.currentTimeMillis() - start;
    return String.format("%4.2fs (%d)", millis / 1000.0, count);
}
nosid
  • 48,932
  • 13
  • 112
  • 139
  • Can you elaborate what is JIT warm up ,If I am not wrong you are doing warm up in side main method by that for loop . thanks for your detailed explanation – Ashok_Pradhan Mar 29 '14 at 07:47
  • I have added a link to an article that explains _warmup_. You can use the command line option `-XX:+PrintCompilation` to see what's going on. – nosid Mar 29 '14 at 09:54
0

It is a well known fact that the F/J framework needs to warm up. The code is written in such a fashion that it only performs well when compiled. You also have to consider the time it takes to create threads. Having a warm up period in a production environment is subjective.

There is a lot of awful code in the framework to overcome this sluggish behavior when first starting.

edharned
  • 1,884
  • 1
  • 19
  • 20