Concurrent for-comprehensions

Question

According to this blog post there's a potential performance issue with for comprehensions. For example:

for {
  a <- remoteCallA()
  b <- remoteCallB()
} yield {
  (a, b)
}

has remoteCallB blocked until remoteCallA is completed. The blog post suggests that we do this instead:

val futureA = remoteCallA()
val futureB = remoteCallB()
for {
  a <- futureA
  b <- futureB
} yield {
  (a, b)
}

which will ensure that the two remote calls can start at the same time.

My question: is the above (and therefore the blog writer) correct?

I've not seen people using this pattern, which has got me wondering whether there are alternative patterns that are generally used instead.

With thanks

As a side note, I don't think "performance issue" is exactly correct, I'd rather say that there's a difference between what was expressed and what was intended. The code, as written, says "do A then B then combine the results", the intent is to say "do A AND B then combine the results" — Angelo Genovese, Jun 04 '15 at 20:27

score 4 · Answer 1 · answered Jun 04 '15 at 19:53

The for comprehension

for {
  a <- remoteCallA()
  b <- remoteCallB()
} yield {
  (a, b)
}

translates to:

remoteCallA().flatMap(a => remoteCallB().map(b => (a, b)))

So, yes, I believe the blogger is correct: remoteCallB() is not invoked until the future from remoteCallA() has completed, so the calls run sequentially rather than concurrently.
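To see the difference without relying on timing, here is a minimal sketch. The names StartOrderDemo, sequential, concurrent and bInvokedWhileAPending are made up for illustration, and the two "remote calls" are stand-ins: call A is backed by a Promise that stays pending, and call B just records that it was invoked.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object StartOrderDemo {
  type Call = () => Future[Int]

  // Calls made inside the for: b() is only invoked from the flatMap callback of a().
  def sequential(a: Call, b: Call): Future[(Int, Int)] =
    for { x <- a(); y <- b() } yield (x, y)

  // Calls hoisted into vals: both are invoked before the for starts.
  def concurrent(a: Call, b: Call): Future[(Int, Int)] = {
    val fa = a()
    val fb = b()
    for { x <- fa; y <- fb } yield (x, y)
  }

  // Reports whether call B was invoked while call A was still pending.
  def bInvokedWhileAPending(combine: (Call, Call) => Future[(Int, Int)]): Boolean = {
    val aResult  = Promise[Int]()            // hypothetical call A, pending until completed below
    val bInvoked = new AtomicBoolean(false)  // set when hypothetical call B is invoked
    val combined = combine(
      () => aResult.future,
      () => { bInvoked.set(true); Future.successful(2) }
    )
    val invokedEarly = bInvoked.get()        // A has not completed at this point
    aResult.success(1)
    Await.result(combined, 5.seconds)        // let the combined future finish
    invokedEarly
  }
}
```

With these definitions, StartOrderDemo.bInvokedWhileAPending(StartOrderDemo.sequential) should come back false and the same check with StartOrderDemo.concurrent should come back true, which matches the flatMap translation above.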

Angelo Genovese

This is why people ask for `for (fa = A(); fb = B(); a <- fa` etc. There's also `Future.fold` and friends, when it applies. – som-snytt Jun 04 '15 at 20:23

Not sure that there really are any other than the one suggested by the blogger. I know that the Scalaz Task provides a greater level of control over when things are executed vs. when they are specified, but the nature of a for comprehension is to use flatMap, and flatMap sort of implies dependency. – Angelo Genovese Jun 04 '15 at 20:23

I wasn't aware you could use = inside a for that way, thanks som-snytt – Angelo Genovese Jun 04 '15 at 20:25

You can't, but people ask for the feature. It's a convenience, with better scoping. – som-snytt Jun 04 '15 at 20:28
I just came across the ticket. It's an ancient desire. https://issues.scala-lang.org/browse/SI-907 – som-snytt Jun 06 '15 at 00:22

score 2 · Accepted Answer · edited May 23 '17 at 12:14

The common pattern to execute several futures simultaneously is to use zip or Future.traverse. Here are a few examples:

for {
  (a, b) <- remoteCallA() zip remoteCallB()
} yield f(a, b)

This becomes a bit cumbersome when there are more than 2 futures:

for {
  ((a, b), c) <- remoteCallA() zip remoteCallB() zip remoteCallC()
} yield (a, b, c)

In those cases you can use Future.sequence:

for {
  Seq(a, b, c) <- 
    Future.sequence(Seq(remoteCallA(), remoteCallB(), remoteCallC()))
} yield (a, b, c)

or Future.traverse, when you have a sequence of arguments and want to apply the same Future-returning function to each of them.
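A minimal sketch of the Future.traverse variant, assuming a made-up fetchName function that stands in for a remote call taking an argument (TraverseSketch and fetchAll are illustrative names too):

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object TraverseSketch {
  // Hypothetical remote call: looks up a name for an id.
  def fetchName(id: Int): Future[String] = Future(s"user-$id")

  // Invokes fetchName for every id up front, then collects the results
  // into a single Future, preserving the order of the ids.
  def fetchAll(ids: List[Int]): Future[List[String]] =
    Future.traverse(ids)(fetchName)
}
```

This is roughly equivalent to Future.sequence(ids.map(fetchName)), just without building the intermediate list of futures yourself.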

Both approaches share a limitation, though: if one of the futures fails early, before the others finish, you would naturally want the resulting Future to fail at that moment. That is not what happens. The results are combined in order, so the resulting Future fails only once every future ahead of the failed one has completed. See this question for details: How to implement Future as Applicative in Scala?
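If failing early matters, one possible workaround is to race the failures against the zipped result through a Promise. This is only a sketch: zipFailFast and FailFastZip are names I made up, it is not something the standard library provides, and it is not necessarily the approach taken in the linked question.

```scala
import scala.concurrent.{ExecutionContext, Future, Promise}

object FailFastZip {
  // Like `fa zip fb`, except that a failure of either input fails the
  // result immediately instead of waiting for the other input.
  def zipFailFast[A, B](fa: Future[A], fb: Future[B])(
      implicit ec: ExecutionContext): Future[(A, B)] = {
    val p = Promise[(A, B)]()
    // Whichever input fails first fails the result right away.
    fa.failed.foreach(e => p.tryFailure(e))
    fb.failed.foreach(e => p.tryFailure(e))
    // Otherwise complete with the pair once both inputs have succeeded.
    fa.zip(fb).onComplete(r => p.tryComplete(r))
    p.future
  }
}
```

The try* methods are used because several callbacks may race to complete the same Promise, and only the first one should win.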
