Why is this not faster using parallel collections?

Question

I just wanted to test the parallel collections a bit and I used the following line of code (in REPL):

(1 to 100000).par.filter(BigInt(_).isProbablePrime(100))

against:

(1 to 100000).filter(BigInt(_).isProbablePrime(100))

But the parallel version is not faster. In fact it even feels a bit slower (But I haven't really measured that).

Has anyone an explanation for that?

Edit 1: Yes, I do have a multi-core processor

Edit 2: OK, I "solved" the problem myself. The implementation of isProbablePrime seems to be the problem and not the parallel collections. I replaced isProbablePrime with another function to test for primality and now I get an expected speedup.

Parallelism is only faster if it lets you get more hardware cranking, and it does have overhead. Is scala set up to make use of the extra cores? — Mike Dunlavey, May 26 '11 at 21:18
I did not know I had to set up anything. Do you have any more information about this? — Plankalkül, May 26 '11 at 21:27
There's no configuration needed here; Scala looks up the number of cores available, and delegates work to an appropriately sized Fork-Join pool. — Scott Morrison, May 26 '11 at 23:41

axel22 · Answer 1 · 2011-05-26T22:11:22.803

6

Both with sequential and parallel ranges, filter will generate a vector data structure - a Vector or a ParVector, respectively.

This is a known problem with parallel vectors that get generated from range collections - transformer methods (such as filter) for parallel vectors do not construct the vector in parallel.

A solution for this that allows efficient parallel construction of vectors has already been developed, but was not yet implemented. I suggest you file a ticket, so that it can be fixed for the next release.

edited May 26 '11 at 22:11

answered May 26 '11 at 21:49

axel22

32,045
9
125
137

I don't think thats the problem. For example if I do something like `(1 to 10000).par.filter(x => {Thread.sleep(1);true})` the parallel version is much faster than the sequential one. So the parallel execution seems to work but it looks like isProbablePrime can not run in more than one thread at once (for whatever reason). – Plankalkül May 27 '11 at 06:20
That's true - it is. However, in that case, the amount of time spent processing each element is much greater than the time needed to construct the `ParVector` (which is currently done sequentially), so you don't notice the time spent creating the vector. The `isProbablePrime` should also be done in parallel, however, the benefits of running it in parallel should be masked by the sequential construction at the end. – axel22 May 27 '11 at 07:51
1

I assume `isProbablePrime` has pretty large portions of synchronized code because another thing that I noticed is that the sequential version maxes out one core but the parallel version distributes the load quite equally. I guess I should just stop using `isProbablePrime` to test parallel stuff :) – Plankalkül May 27 '11 at 11:08
Yes, I just implemented my own isPrime-Function and there the parallel version is much much faster. So in my case it is definitely a problem with `isProbablePrime` and not with the parallel collections of scala itself. – Plankalkül May 27 '11 at 11:14
My guess is that both versions of `isProbablePrime` do a fair amount of work. You question is still legitimate, I would say - if you had some predicate that was computationally fairly cheap, you would "feel" the cost of the `ParVector` construction in the end in terms of performance. – axel22 May 27 '11 at 13:03

Why is this not faster using parallel collections?

1 Answers1