I am reading through some Java 8 code, and from what I understand, you can iterate over collections using stream() or parallelStream(). The latter has the advantage of using concurrency to split the work across the cores of a modern multicore processor and speed up the iteration (although it does not guarantee the order of the results).

The example code from the Oracle Java tutorial:

double average = roster
        .stream()  /** non-parallel **/
        .filter(p -> p.getGender() == Person.Sex.MALE)
        .mapToInt(Person::getAge)
        .average()
        .getAsDouble();    

double average = roster
        .parallelStream()
        .filter(p -> p.getGender() == Person.Sex.MALE)
        .mapToInt(Person::getAge)
        .average()
        .getAsDouble();

If I had a collection and did not care about the order in which it was processed (i.e. the elements all have unique IDs, are unordered anyway, or are already sorted), would it make sense to always use parallelStream() to iterate over it?

Other than when my code runs on a single-core machine (at which point I assume the JVM would allocate all the work to the single core, so my program would not break), are there any drawbacks to using parallelStream() everywhere?

Stuart Marks
Husman

1 Answer

If you listen to people from Oracle talking about the design choices behind Java 8, you will often hear that parallelism was the main motivation. Parallelization was the driving force behind lambdas, the Stream API, and other features. Let's take a look at an example that uses the Stream API.

// Assumes static imports of IntStream.range, LongStream.rangeClosed and Math.sqrt.
private long countPrimes(int max) {
    return range(1, max).parallel().filter(this::isPrime).count();
}

private boolean isPrime(long n) {
    return n > 1 && rangeClosed(2, (long) sqrt(n)).noneMatch(divisor -> n % divisor == 0);
}

Here we have a method, countPrimes, that counts the prime numbers between 1 and max. The stream of numbers is created by the range method. The stream is then switched to parallel mode, the numbers that are not prime are filtered out, and the remaining numbers are counted.
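The snippet above does not compile on its own, so here is a minimal self-contained sketch of the same idea. The class name PrimeCounter and the boolean parallel flag are illustrative additions, not part of the original article, and the fully qualified IntStream/LongStream calls are an assumption about what the original static imports referred to.

```java
import java.util.stream.IntStream;
import java.util.stream.LongStream;

public class PrimeCounter {

    // Counts the primes in [1, max). IntStream.range excludes the upper bound.
    // The 'parallel' flag is a hypothetical switch added for comparison.
    static long countPrimes(int max, boolean parallel) {
        IntStream numbers = IntStream.range(1, max);
        if (parallel) {
            numbers = numbers.parallel();
        }
        return numbers.filter(PrimeCounter::isPrime).count();
    }

    // Deliberately naive trial division, as in the answer's example.
    static boolean isPrime(long n) {
        return n > 1 && LongStream.rangeClosed(2, (long) Math.sqrt(n))
                .noneMatch(divisor -> n % divisor == 0);
    }

    public static void main(String[] args) {
        int max = 1_000_000;
        long sequential = countPrimes(max, false);
        long parallel = countPrimes(max, true);
        // Both modes should report the same count; only the elapsed time may differ.
        System.out.println("sequential: " + sequential + ", parallel: " + parallel);
    }
}
```

Counting is order-independent, so the sequential and parallel runs should agree. Whether the parallel run is actually faster depends on the machine and the amount of work per element.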

You can see that the Stream API allows us to describe the problem in a neat and compact way. Moreover, parallelization is just a matter of calling the parallel() method. When we do that, the stream is split into multiple chunks, each chunk is processed independently, and the partial results are combined at the end. Since our implementation of the isPrime method is extremely inefficient and CPU-intensive, we can take advantage of parallelization and use all available CPU cores. Source: http://java.dzone.com/articles/think-twice-using-java-8
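To make the chunking a little more concrete, the following sketch records which threads end up processing the elements of a parallel stream. It relies on the fact that, in the standard JDK implementation, parallel streams run on the common ForkJoinPool by default, with the calling thread taking part as well. The class and method names are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

public class ParallelThreads {

    // Returns the names of the threads that processed at least one element.
    static Set<String> workerThreadNames(int elements) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set
        IntStream.range(0, elements)
                .parallel()
                .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
        System.out.println("threads used: " + workerThreadNames(10_000));
    }
}
```

On a machine with few cores the set of thread names will be small, which is one way to see that a parallel stream degrades to mostly sequential work rather than failing.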

KS7
  • You just copied and pasted the first part of that article. I read it before I posted my question and it did not answer my question other than saying that a blocked thread will stop the entire task. But for simple examples like in the example code I posted, this should not be a problem (i.e. getting the average age of all males - this should not block my threads unless my application is reading the data over the network and someone pulls out the network cable). – Husman May 30 '14 at 10:34