
Enter the following small sequential program and its parallelized version in the Scala REPL:

/* Activate time measurement in "App" class. Prints [total <X> ms] on exit. */
util.Properties.setProp("scala.time", "true")
/* Define sequential program version. */
object X extends App { for (x <- (1 to 10)) {Thread.sleep(1000);println(x)}}
/* Define parallel program version. Note '.par' selector on Range here. */
object Y extends App { for (y <- (1 to 10).par) {Thread.sleep(1000);println(y)}}

Executing X with X.main(Array.empty) gives:

1
2
3
4
5
6
7
8
9
10
[total 10002ms]

Whereas Y with Y.main(Array.empty) gives:

1
6
2
7
3
8
4
9
10
5
[total 5002ms]

So far so good. But what about the following two variations of the program:

object X extends App {(1 to 10).foreach{Thread.sleep(1000);println(_)}}
object Y extends App {(1 to 10).par.foreach{Thread.sleep(1000);println(_)}}

They give me runtimes of [total 1002ms] and [total 1002ms] respectively. How can this be?

Tim Friske
  • Just speculation: since foreach does not return a sensible value, maybe the program quits before all tasks have been executed. Do you get all 10 outputs? – ziggystar Sep 18 '11 at 15:39
  • Yes, I do, in the for-comprehension as well as in the functional approach. – Tim Friske Sep 18 '11 at 15:48

2 Answers


This has nothing to do with parallel collections. The problem is hidden in the function literal. You can see it if you let the compiler show the AST (with the option -Xprint:typer):

for (x <- (1 to 10)) {Thread.sleep(1000);println(x)}

produces

scala.this.Predef.intWrapper(1).to(10).foreach[Unit](((x: Int) => {
  java.this.lang.Thread.sleep(1000L);
  scala.this.Predef.println(x)
}))

whereas

(1 to 10).foreach{Thread.sleep(1000);println(_)}

produces

scala.this.Predef.intWrapper(1).to(10).foreach[Unit]({
  java.this.lang.Thread.sleep(1000L);
  ((x$1: Int) => scala.this.Predef.println(x$1))
})

There is a little difference. If you want the expected result you have to change the foreach-expression to

(1 to 10).foreach{x => Thread.sleep(1000);println(x)}

But what is the difference? In your code you pass a block to foreach. The block is evaluated once, so Thread.sleep(1000) runs a single time, and the block's value is its last expression, the function println(_). Only this function is handed to foreach, not the block that contains it.
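The effect can be reproduced without waiting for Thread.sleep. The following is only a minimal sketch: the names BlockVsFunction, blockRuns, bodyRuns, seen, and record are made up for illustration, and counters stand in for the sleep call.

```scala
object BlockVsFunction {
  // Counts how often the statement in front of the function literal runs.
  var blockRuns = 0
  // Counts how often the statement inside the function body runs.
  var bodyRuns = 0
  // Collects every element the passed function is applied to.
  val seen = scala.collection.mutable.ListBuffer.empty[Int]

  // Stand-in for println: records the element instead of printing it.
  def record(x: Int): Unit = { seen += x }

  def run(): Unit = {
    // The block is evaluated once; its value, the function record(_), is what foreach receives.
    (1 to 10).foreach { blockRuns += 1; record(_) }
    // Here everything after `x =>` is the function body, so it runs for each element.
    (1 to 10).foreach { x => bodyRuns += 1; record(x) }
  }
}
```

After calling BlockVsFunction.run(), blockRuns is 1 and bodyRuns is 10, while seen holds 20 elements, because both variants still apply the function to every element. This matches the observation that all ten numbers are printed even though the sleep happens only once.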

This is a common mistake. It has to do with the underscore placeholder syntax. Maybe this question helps you.

kiritsuku
  • So never write imperative code in Scala? Who would have thought of this? – ziggystar Sep 18 '11 at 16:32
  • The rules for the underscore literal must be known in order to use it correctly. As I said, this is not a problem of parallel collections. – kiritsuku Sep 18 '11 at 16:48
  • Thanks for sharing your insights. Especially the hint about that compiler option was very helpful to see how the expressions get "desugared". I have also read the article you were pointing me to. I read it twice but still have some problems remembering all these rules. – Tim Friske Sep 18 '11 at 16:54
  • Another excellent answer. That German Scala Tutorial must be really good! :-) – Daniel C. Sobral Sep 19 '11 at 15:42
  • Thanks, Daniel. Nice to hear that from you. ;) – kiritsuku Sep 19 '11 at 16:00

An interesting way of thinking about it is that because Scala is call-by-value (Call by name vs call by value in Scala, clarification needed), when you hand {Thread.sleep(1000);println(_)} to foreach, the block is evaluated only once and only the resulting println(_) function is handed to foreach. When you write foreach{x => Thread.sleep(1000); println(x)}, both Thread.sleep(1000) and println(x) are part of the function handed to foreach. This is just another way of saying what sschaef already said.
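To illustrate the call-by-value point, here is a small sketch that contrasts a by-value function parameter with a by-name one. The object ByValueVsByName and the methods eachByValue and eachByName are hypothetical helpers written for this comparison; foreach itself takes its argument by value, like eachByValue.

```scala
object ByValueVsByName {
  // Counts how often the block passed as argument is evaluated.
  var evaluations = 0

  // Like foreach: the argument is evaluated once, before the method body runs.
  def eachByValue(f: Int => Unit): Unit = for (i <- 1 to 3) f(i)

  // Hypothetical by-name variant: the argument expression is re-evaluated on every use of f.
  def eachByName(f: => Int => Unit): Unit = for (i <- 1 to 3) f(i)

  def countByValue(): Int = {
    evaluations = 0
    // The block runs once and yields the function that is then applied three times.
    eachByValue { evaluations += 1; (i: Int) => () }
    evaluations
  }

  def countByName(): Int = {
    evaluations = 0
    // The block runs again each time f is referenced inside eachByName.
    eachByName { evaluations += 1; (i: Int) => () }
    evaluations
  }
}
```

Here countByValue() returns 1 and countByName() returns 3. With a by-value parameter the statement in front of the function literal is a one-time setup step, which is exactly what happens to Thread.sleep(1000) in the question.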

Andrew Cassidy