Future call in for comprehension in Scala - how do I process them sequentially?

Question

I have code similar to this:

for (n <- 1 to 1000) {
  someFuture map { result =>
    // some other stuff
  }
}

This is a basic piece of code and works fine. However, someFuture runs queries against a database, and the database cannot handle several queries in parallel, which is exactly what happens now: the loop starts many threads executing someFuture at once, as one would expect.

Ideally, I would like to do it sequentially (i.e. call someFuture when n=1, do some processing, call someFuture when n=2, do some processing, etc.). I thought about using a blocking method (from Await), but this happens inside an actor, so blocking is not a good idea. Another idea was to create a fixed thread pool for this particular future call, but that sounds like overkill. What should I do instead?

Update: I have found this answer which suggests creating a fixed thread pool as I thought. Still, is this the right way to do it?

score 1 · Answer 1 · answered May 31 '17 at 17:08

You want to map or flatMap a single future, so that each step runs only when the previous one completes.

scala> val f = Future(42)
f: scala.concurrent.Future[Int] = Future(Success(42))

scala> (1 to 10).foldLeft(f)((f,x) => f.map(i => i + 1))
res1: scala.concurrent.Future[Int] = Future(<not completed>)

scala> res1
res2: scala.concurrent.Future[Int] = Future(Success(52))

scala> (1 to 10).foldLeft(f)((f,i) => {
     | println(i)
     | f.map(x => x+i) })
1
2
3
4
5
6
7
8
9
10
res4: scala.concurrent.Future[Int] = Future(<not completed>)

scala> res4
res5: scala.concurrent.Future[Int] = Future(Success(97))
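The REPL session above chains plain `map` calls on one future. For the question's case, where each step must start a new query, the same fold can be written with `flatMap`. The following is only a minimal sketch: `someFuture(n: Int)` is a hypothetical stand-in for the asker's database call, and it is assumed to be a `def` so that each future is created only when the previous one has completed. The `maxInFlight` counter is added purely to make the sequencing observable.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SequentialFold {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Instrumentation only: how many hypothetical queries run at once.
  private val inFlight = new AtomicInteger(0)
  val maxInFlight = new AtomicInteger(0)

  // Hypothetical stand-in for the database call in the question.
  // It is a def, so no future is started until it is invoked.
  def someFuture(n: Int): Future[Int] = Future {
    val now = inFlight.incrementAndGet()
    maxInFlight.updateAndGet(m => math.max(m, now))
    Thread.sleep(1) // pretend to wait for the database
    inFlight.decrementAndGet()
    n * 2
  }

  // Start with an already-completed future and append one step per element.
  // someFuture(n) is invoked inside flatMap, i.e. after the previous step finished.
  def runAll(ns: Seq[Int]): Future[Vector[Int]] =
    ns.foldLeft(Future.successful(Vector.empty[Int])) { (acc, n) =>
      acc.flatMap(results => someFuture(n).map(r => results :+ r))
    }
}
```

`SequentialFold.runAll(1 to 1000)` returns a single `Future[Vector[Int]]` that can be consumed with `map` or `onComplete`, so nothing has to block inside the actor. `Await` is imported only for trying the sketch out in a REPL or a test.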

som-snytt


This code is not stack-safe. You'll get a `StackOverflowError` if the number of futures is large. – simpadjo May 31 '17 at 17:56
@simpadjo What did you try? For me, `(1 to Int.MaxValue)` exhausts heap, not stack. – som-snytt May 31 '17 at 20:57
Here is a good article that covers the stack-safety problem with futures: https://alexn.org/blog/2017/01/30/asynchronous-programming-scala.html#h3-3. I have faced it myself in production as well. – simpadjo May 31 '17 at 21:36
I can't run your code now, but I think you get the OOM while the sequence is being constructed, even before working with futures. – simpadjo May 31 '17 at 21:40
@simpadjo it matters exactly what you're trying. Also, future schedules immediately (which people complain about) and a mapped future runs on that completion. Orthogonally, you can express the arbitrary sequence as a loop with a var, but the notion of repeatedly mapping is the same. – som-snytt May 31 '17 at 23:10
You are right. I checked your code. I previously got a `StackOverflowError` when I added a callback to a `Future` in a recursive method. You are using `foldLeft` and it is safe. – simpadjo Jun 01 '17 at 09:14

score 0 · Answer 2 · answered May 31 '17 at 16:43

One approach would be to send the message to an actor that processes the data. Since an actor processes messages one at a time, your queries would execute sequentially rather than in parallel.

for (n <- 1 to 1000) {
  someFuture map { x =>
    actor ! x
  }
}

rogue-one


The query happens in `someFuture`, therefore what's inside the `map`ped `Future` doesn't really matter much, right? – Bob Dem May 31 '17 at 16:45
Yes, if the query runs in someFuture, then all the queries have already been triggered before execution reaches the body of the for comprehension. So in your case you will have to send the queries to your actor before the for comprehension block. – rogue-one May 31 '17 at 16:49

score 0 · Answer 3 · answered May 31 '17 at 17:41

Probably the ideal long-term way to handle this is to use a database access layer that does connection pooling. Most frameworks, like Play or Slick, have some preferred way of handling this, or if you want something standalone, DBCP might be a good option. I think most of these should have a "natural" way to limit the number of connections to a fixed size and block if no connections in the pool are available, which would limit your parallelism.

Other than introducing some other dependency like that, using a thread pool execution context as you mentioned is definitely the way to go. It's not overkill; it's very common, and will be much less hacky than any other way of handling this.
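As a rough illustration of the thread pool approach, here is a minimal sketch using only the standard library. The names `DbContext`, `dbEc` and `query` are hypothetical, and the sketch assumes the database call blocks inside the `Future` body; if the call were itself asynchronous, a single thread would not serialize it.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService, Future}

object DbContext {
  // A pool with exactly one worker thread: tasks submitted to it
  // run one at a time, in submission order.
  val dbEc: ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newSingleThreadExecutor())

  // Hypothetical blocking query. Passing dbEc explicitly keeps the rest
  // of the application on its usual execution context.
  def query(n: Int): Future[String] = Future {
    // A real implementation would call the database here; this one
    // only reports which thread ran it.
    Thread.currentThread().getName + ":" + n
  }(dbEc)
}
```

With this in place the loop from the question can stay as it is: every `DbContext.query(n)` is queued immediately, but only one runs at a time. The pool should be shut down with `DbContext.dbEc.shutdown()` when the application stops, since its worker thread is not a daemon thread.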

I've done multiple ECs, so I don't want to say it's not viable, but for the simple use case of chained execution, I don't see an advantage over just `future.map` or similar, which means "run this when the future completes". There could be a big difference in how work is apportioned, depending on the use case. – som-snytt May 31 '17 at 21:10
