Performance of scala parallel collection processing

Question

I have scenarios where I will need to process thousands of records at a time. Sometime, it might be in hundreds, may be upto 30000 records. I was thinking of using the scala's parallel collection. So just to understand the difference, I wrote a simple pgm like below:

object Test extends App{
  val list = (1 to 100000).toList
  Util.seqMap(list)
  Util.parMap(list)
}

object Util{
  def seqMap(list:List[Int]) = {
    val start = System.currentTimeMillis
    list.map(x => x + 1).toList.sum
    val end = System.currentTimeMillis
    println("time taken =" + (end - start))
    end - start
  }
  def parMap(list:List[Int]) = {
    val start = System.currentTimeMillis
    list.par.map(x => x + 1).toList.sum
    val end = System.currentTimeMillis
    println("time taken=" + (end - start))
    end - start
  }
}

I expected that running in parallel will be faster. However, the output I was getting was

time taken =32
time taken=127

machine config :

Intel i7 processor with 8 cores
16GB RAM
64bit Windows 8

What am I doing wrong? Is this not a correct scenario for parallel mapping?

dhg · Accepted Answer · 2015-02-13T11:16:03.057

10

The issue is that the operation you are performing is so fast (just adding two ints) that the overhead of doing the parallelization is more than the benefit. The parallelization only really makes sense if the operations are slower.

Think of it this way: if you had 8 friends and you gave each one an integer on a piece of paper and told them to add one, write the result down, and give it back to you, which you would record before giving them the next integer, you'd spend so much time passing messages back and forth that you could have just done all the adding yourself faster.

ALSO: Never do .par on a List because the parallelization procedure has to copy the entire list into a parallel collection and then copy the whole thing back out. If you use a Vector, then it doesn't have to do this extra work.

edited Feb 13 '15 at 11:16

answered Feb 13 '15 at 11:10

dhg

52,383
8
123
144

"ALSO: Never do .par on a List ". Let say, I am getting the thousand of records as a list, does it mean than converting to vector first and then doing par on it is faster? – Yadu Krishnan Feb 13 '15 at 12:41
3

If you call .par on a List, it is forced to copy the whole list before it can do anything, and then it has to copy it back so that it can return a List to you. You doing the conversion manually probably won't make it any better, but if you use a Vector, then it doesn't have to do either copy. I would actually recommend against every using List; Vector is apparently better for basically everything. – dhg Feb 13 '15 at 20:26
Just one more doubt. What if I am having some db queries inside a map operation. Parallelizing that map will help it positively? For eg: val someVal list.par.map { item=> //do some operations with the 'item' and get some result 'res' //query the database with the result 'res' } I have a few scenarios where I need to generate some report. The logic inside the maps is little complex, but after getting the complex logic, I need to call database with that result. So was wondering, will there be any pblms in this case using parallel map. (note: i am not using connection pooling now.) – Yadu Krishnan Feb 16 '15 at 04:41

elm · Answer 2 · 2015-02-13T11:29:16.207

The overhead in parallelizing the list proves more time-consuming than the actual processing of the x + 1 operations sequentially.

Yet consider this modification where we include an operation that elapses over 1 millisecond approximately,

case class Delay() {
  Thread.sleep(1)
}

and replace

list.map(x => x + 1).toList.sum

with

list.map(_ => Delay()).toList

Now for val list = (1 to 10000).toList (note 10000 instead of 100000), in a quadcore 8GB machine,

scala> Util.parMap(list)
time taken=3451
res4: Long = 3451

scala> Util.seqMap(list)
time taken =10816
res5: Long = 10816

We can infer (better, guess) that for large collections with time-consuming operations, the overhead of parallelizing a collection does not significantly affect the elapsed time, in contrast with a sequential collection processing.

score 1 · Answer 3 · answered Feb 13 '15 at 11:11

If you are doing benchmarks, consider using something like JMH to avoid all the possible problems you might encounter, if you are measuring it in the way your program shows. For example, JIT may change your results dramatically, but only after some iterations.

In my experience parallel collections are normally slower, if the input is not large enough: If the input is small the initial split and the "putting together" at the end does not pay off.

So benchmark again, using lists of different sizes (try 30 000, 100 000, and 1 000 000).

Moreover if you do numerical processing, consider using Array (instead of List) and while (instead of map). These are "more native" (= faster) to the underlying JVM whereas in your case you are possibly measuring the performance of the garbage collector. As for Array you could store the result of the operation "in place".

score 0 · Answer 4 · answered Feb 13 '15 at 11:49

0

Parallel collections initialize threads before performing operation that tooks some time.

So when you perform operations by parallel collections with low number of elements or operations takes low time parallel collections will performs slower

answered Feb 13 '15 at 11:49

Lucky Libora

140
1
3

Performance of scala parallel collection processing

4 Answers4

Linked