Most idiomatic way to mix synchronous, asynchronous, and parallel computation in a scala for comprehension of futures

Question

Suppose I have 4 future computations to do. The first two can be done in parallel, but the third must be done after the first two (even though the values of the first two are not used in the third -- think of each computation as a command that performs some db operation). Finally, there is a 4th computation that must occur after all of the first 3. Additionally, there is a side effect that can be started after the first 3 complete (think of this as kicking off a periodic runnable). In code, this could look like the following:

for {
  _ <- async1     // not done in parallel with async2 :( is there
  _ <- async2     // any way of achieving this cleanly inside of for?
  _ <- async3
  _ =  sideEffect // do I need "=" here??
  _ <- async4
} yield ()

The comments show my doubts about the quality of the code:

1. What's the cleanest way to do two operations in parallel in a for comprehension?
2. Is there a way to achieve this result without so many "_" characters (and without assigning a named reference, at least in the case of sideEffect)?
3. What's the cleanest and most idiomatic way to do this?

score 2 · Accepted Answer · answered May 26 '14 at 03:23


You can use zip to combine two futures, including the result of zip itself. You'll end up with tuples holding tuples, but if you use infix notation for Tuple2 it is easy to take them apart. Below I define a synonym ~ for succinctness (this is what the parser combinator library does, except its ~ is a different class that behaves similarly to Tuple2).

As an alternative for _ = for the side effect, you can either move it into the yield, or combine it with the following statement using braces and a semicolon. I would still consider _ = to be more idiomatic, at least so far as having a side effecting statement in the for is idiomatic at all.

val ~ = Tuple2

for {
  a ~ b ~ c <- async1 zip
               async2 zip
               async3
  d <- { sideEffect; async4 }
} yield (a, b, c, d)
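For readers who want to run this, here is a minimal self-contained sketch of the same pattern. The question never defines `async1` through `async4` or `sideEffect`, so the versions below are hypothetical stand-ins that return fixed values, and the sketch assumes Scala 2.12 or 2.13. Note that `zip` takes its argument by value, so zipping all three futures starts `async3` alongside the first two; if `async3` really has to wait for them, it would need its own generator line as in the question.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ZipSketch {
  // Alias the Tuple2 companion so that `a ~ b` can be used as an infix pattern.
  val ~ = Tuple2

  // Hypothetical stand-ins for the question's computations (assumptions).
  def async1: Future[Int]    = Future(1)
  def async2: Future[String] = Future("two")
  def async3: Future[Double] = Future(3.0)
  def async4: Future[Char]   = Future('4')

  // Hypothetical side effect: just flips a flag so it can be observed.
  @volatile var sideEffectRan = false
  def sideEffect(): Unit = { sideEffectRan = true }

  def run(): Future[(Int, String, Double, Char)] =
    for {
      // zip nests to the left, so the zipped value is ((Int, String), Double)
      // and the pattern a ~ b ~ c takes it apart in the same order.
      a ~ b ~ c <- async1 zip async2 zip async3
      // The side effect runs once all three zipped futures have completed.
      d         <- { sideEffect(); async4 }
    } yield (a, b, c, d)

  // Blocks only so the sketch can be checked from a script or REPL.
  def result(): (Int, String, Double, Char) = Await.result(run(), 5.seconds)
}
```

Calling `ZipSketch.result()` should return `(1, "two", 3.0, '4')`, with `sideEffectRan` set to `true` afterwards.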

answered May 26 '14 at 03:23

wingedsubmariner


Neat. Learned a new ~ syntax. Problem with zip is with the failure handling. From docs, `If 'this' future fails, the resulting future is failed with the throwable stored in 'this'. Otherwise, if 'that' future fails, the resulting future is failed with the throwable stored in 'that'.` Instead, order should not matter and throwables should accumulate in a `Seq[Throwable]` in a `Failure`. – ferk86 May 26 '14 at 03:42
@ferk86 Maybe, but that is not how `Try` (the underlying encapsulation of success/failure in futures) works. This isn't just a problem with zip, but with `Future.sequence`, many uses of `flatMap`, etc. You'd have to rewrite `Try`, `Future`, and `Promise` to get the desired behavior. Instead, I would prefer a behavior where it returns whichever failure happens first; this would allow for faster completion in case of errors. I agree that order shouldn't matter. – wingedsubmariner May 26 '14 at 05:35
Designing an API that allows the programmer to specify fail-fast monadic behavior or error accumulation still has some interesting open questions—see for example [my question here](http://stackoverflow.com/q/20065853/334519) or [this mailing list thread](https://groups.google.com/d/msg/scalaz/R-YYRoqzSXk/_wUp0XGQNqwJ). – Travis Brown May 26 '14 at 16:07

score 2 · Answer 2 · answered May 26 '14 at 03:27

For-comprehensions represent monadic operations, and monadic operations are sequenced. There's a superclass of monad, applicative, where computations don't depend on the results of prior computations and thus may be run in parallel.

Scalaz has a |@| operator for combining applicatives, so you can use (future1 |@| future2)(proc(_, _)) to dispatch two futures in parallel and then run "proc" on the result of both of them, as opposed to sequential computation of for {a <- future1; b <- future2(a)} yield b (or just future1 flatMap future2).

There's already a method on stdlib Futures called .zip that combines Futures in parallel, and indeed the scalaz impl uses this: https://github.com/scalaz/scalaz/blob/scalaz-seven/core/src/main/scala/scalaz/std/Future.scala#L36 And .zip and for-comprehensions may be intermixed to have parallel and sequential parts, as appropriate. So just using the stdlib syntax, your above example could be written as:

for {
  _ <- async1 zip async2
  _ <- async3
  _ =  sideEffect
  _ <- async4
} yield ()

Alternatively, written w/out a for-comprehension:

async1 zip async2 flatMap (_ => async3) flatMap { _ => sideEffect; async4 }

In scalaz that could be written something like: `(async1 |@| async2).tupled >> async3 >> {sideEffect; async4}` – pdxleif Jun 04 '14 at 21:48
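To make the ordering in the stdlib version concrete, here is a small sketch that records the order in which each step runs. The `step` and `record` helpers and the `MixedSketch` object are hypothetical names introduced only for this illustration; each step just appends its name to a log instead of touching a database.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object MixedSketch {
  // Shared log of step names, guarded by the object's monitor.
  private var events = Vector.empty[String]
  private def record(e: String): Unit = synchronized { events :+= e }

  // Hypothetical stand-in for a db command: records its name when it runs.
  def step(name: String): Future[Unit] = Future { record(name) }

  def run(): Future[Unit] =
    for {
      _ <- step("async1") zip step("async2") // both created before either is awaited
      _ <- step("async3")                    // created only after the zip completes
      _  = record("sideEffect")              // runs after async3, before async4
      _ <- step("async4")
    } yield ()

  // Blocks only so the log can be inspected; intended to be called once.
  def runAndLog(): Vector[String] = {
    Await.result(run(), 5.seconds)
    synchronized(events)
  }
}
```

The first two entries of the log can appear in either order, since those two steps run concurrently, but the remaining entries should always be `async3`, `sideEffect`, `async4`.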

score 2 · Answer 3 · answered May 26 '14 at 11:14

Just as an FYI, it's really simple to get two futures to run in parallel and still process them via a for-comprehension. The suggested solutions of using zip can certainly work, but I find that when I want to handle a couple of futures and do something when they are all done, and I have two or more that are independent of each other, I do something like this:

val f1 = async1
val f2 = async2
//First two futures now running in parallel

for {
  r1 <- f1     
  r2 <- f2     
  _ <- async3
  _ =  sideEffect 
  _ <- async4
} yield {
  ...
}

Now the way the for comprehension is structured certainly waits on f1 before checking on the completion status of f2, but the logic behind these two futures is running at the same time. This is a little simpler than some of the suggestions but still might give you what you need.
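The difference between creating the futures up front and creating them inside the comprehension can be checked deterministically. In the sketch below, `task`, `Recorder`, and `EagerSketch` are hypothetical helpers: a task counts as started as soon as it is called, and it only completes once its gate (a `Promise`) is released, which stands in for slow I/O.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object EagerSketch {
  // Records, in order, which tasks have been started.
  final class Recorder {
    private var started = Vector.empty[String]
    def start(name: String): Unit = synchronized { started :+= name }
    def snapshot: Vector[String] = synchronized(started)
  }

  // Started when called; completes only after `gate` is completed.
  def task(name: String, gate: Promise[Unit], rec: Recorder): Future[String] = {
    rec.start(name)
    gate.future.map(_ => name)
  }

  // Futures assigned to vals first: both are started before either finishes.
  def eager(): (Vector[String], List[String]) = {
    val rec = new Recorder
    val (g1, g2) = (Promise[Unit](), Promise[Unit]())
    val f1 = task("f1", g1, rec)
    val f2 = task("f2", g2, rec)
    val combined = for { r1 <- f1; r2 <- f2 } yield List(r1, r2)
    val startedBeforeRelease = rec.snapshot
    g1.success(()); g2.success(())
    (startedBeforeRelease, Await.result(combined, 5.seconds))
  }

  // Futures created inside the comprehension: f2 is not started until f1 is done.
  def inline(): (Vector[String], List[String]) = {
    val rec = new Recorder
    val (g1, g2) = (Promise[Unit](), Promise[Unit]())
    val combined = for {
      r1 <- task("f1", g1, rec)
      r2 <- task("f2", g2, rec)
    } yield List(r1, r2)
    val startedBeforeRelease = rec.snapshot
    g1.success(()); g2.success(())
    (startedBeforeRelease, Await.result(combined, 5.seconds))
  }
}
```

Before the gates are released, `eager()` has already started both `f1` and `f2`, whereas `inline()` has started only `f1`; both variants end up with the same result list.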

ferk86 · Answer 4 · 2014-05-25T23:36:57.220


Your code already looks well structured, apart from not computing the futures in parallel.

1. Use helper functions, ideally writing a code generator to print out helpers for all tuple cases.
2. As far as I know, you need to name the result or assign it to _.
3. See the example code below.

Example code with helpers.

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object Example {
  def run: Future[Unit] = {
    for {
      (a, b, c) <- par(
        Future.successful(1),
        Future.successful(2),
        Future.successful(3)
      )
      constant = 100
      (d, e) <- par(
        Future.successful(a + 10),
        Future.successful(b + c)
      )
    } yield {
      println(constant)
      println(d)
      println(e)
    }
  }

  def par[A,B](a: Future[A], b: Future[B]): Future[(A, B)] = {
    for {
      a <- a
      b <- b
    } yield (a, b)
  }

  def par[A,B,C](a: Future[A], b: Future[B], c: Future[C]): Future[(A, B, C)] = {
    for {
      a <- a
      b <- b
      c <- c
    } yield (a, b, c)
  }
}
Example.run

Edit:

generated code for 1 to 20 futures: https://gist.github.com/nanop/c448db7ac1dfd6545967#file-parhelpers-scala

parPrinter script: https://gist.github.com/nanop/c448db7ac1dfd6545967#file-parprinter-scala

edited May 25 '14 at 23:36

answered May 25 '14 at 23:16

ferk86


`a zip b` already does more or less the same thing as your `par(a, b)`. Also other people's tastes may differ, but I personally find the shadowing in e.g. `a <- a` extremely unpleasant. – Travis Brown May 25 '14 at 23:50
@TravisBrown good point! `zip` doesn't seem to output 3 or more tuples. I thought you'd bring in shapeless. If the types were related, `Future.sequence` could be used instead. Also, the par helpers posted don't handle failures, though it could be done with creating a `Promise` and using `tryComplete`. – ferk86 May 26 '14 at 00:05
the future in `a <- a` can be easily renamed, and become `a <- fa` since code is all generated. Naming many variables becomes tiresome so it wasn't done here – ferk86 May 26 '14 at 00:10
`zip` is nice for two futures! Too bad `zipped` doesn't work with tuples of `Future` objects as it does for lists. – jonderry May 26 '14 at 01:17
I wonder if the Scala type system is capable of supporting `zipped` for heterogeneous tuples of arbitrary length. – jonderry May 26 '14 at 01:30
Yes tuples are made of heterogeneous elements by definition. See http://stackoverflow.com/questions/9632094/zip-multiple-sequences. Using `zip` however still leaves failure handling unanswered. What I mean is when Futures are composed to compute asynchronously, failure should be returned from all the futures instead of one. Thus, failure should be of a type `Seq[Throwable]` which accumulates all failures from the `par(fa,fb,...)` or any merging function, and this is possible by building a `Future` from a `Promise`. – ferk86 May 26 '14 at 02:15
Here's [a Shapeless-supported `zipN`](https://gist.github.com/travisbrown/00fadaa6b51bf882ab92). – Travis Brown May 26 '14 at 02:44
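The failure-accumulating behavior discussed in the comments can be sketched for the two-future case. This is only an illustration under stated assumptions: `par2`, `Accumulated`, and `AccumulateSketch` are hypothetical names, and instead of the `Promise`-based construction mentioned above it uses `Future.transform` (available from Scala 2.12) to lift each outcome into a `Try` before combining. It waits for both futures to finish, so it trades away the fail-fast behavior in exchange for reporting every failure.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object AccumulateSketch {
  // Hypothetical exception type that carries every failure, not just the first.
  final case class Accumulated(errors: Seq[Throwable])
      extends Exception(s"${errors.size} future(s) failed")

  def par2[A, B](fa: Future[A], fb: Future[B])(
      implicit ec: ExecutionContext): Future[(A, B)] = {
    // Lift each outcome into a Try so that a failure no longer short-circuits zip.
    val ta: Future[Try[A]] = fa.transform(t => Success(t))
    val tb: Future[Try[B]] = fb.transform(t => Success(t))
    ta.zip(tb).flatMap {
      case (Success(a), Success(b)) => Future.successful((a, b))
      case (ra, rb) =>
        // Collect failures in argument order, regardless of completion order.
        val errors = Seq(ra, rb).collect { case Failure(e) => e }
        Future.failed(Accumulated(errors))
    }
  }

  // Usage example: both inputs fail, and both messages are reported.
  def demo(): Seq[String] = {
    val failing = par2(
      Future.failed[Int](new RuntimeException("a")),
      Future.failed[Int](new IllegalStateException("b")))
    Await.ready(failing, 5.seconds).value match {
      case Some(Failure(Accumulated(errors))) => errors.map(_.getMessage)
      case _                                  => Seq.empty
    }
  }
}
```

Extending this to three or more futures would need either one helper per arity, as in the generated `par` helpers, or a generic approach along the lines of the Shapeless `zipN` linked above.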
