
Consider the following two snippets: the first wraps scalaj-http requests in Future, whilst the second uses sttp with the async-http-client backend.

Sync client wrapped with Future using global EC

object SyncClientWithFuture {
  def main(args: Array[String]): Unit = {
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.{Await, Future}
    import scala.concurrent.duration.Duration
    import scalaj.http.Http
    val delay = "3000"
    val slowApi = s"http://slowwly.robertomurray.co.uk/delay/${delay}/url/https://www.google.co.uk"
    val nestedF = Future(Http(slowApi).asString).flatMap { _ =>
      Future.sequence(List(
        Future(Http(slowApi).asString),
        Future(Http(slowApi).asString),
        Future(Http(slowApi).asString)
      ))
    }
    // time(...) is a simple timing helper defined elsewhere in the original snippet
    time { Await.result(nestedF, Duration.Inf) }
  }
}

Async client using global EC

object AsyncClient {
  def main(args: Array[String]): Unit = {
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.{Await, Future}
    import scala.concurrent.duration.Duration
    import sttp.client._
    import sttp.client.asynchttpclient.future.AsyncHttpClientFutureBackend
    implicit val sttpBackend = AsyncHttpClientFutureBackend()
    val delay = "3000"
    val slowApi = uri"http://slowwly.robertomurray.co.uk/delay/${delay}/url/https://www.google.co.uk"
    val nestedF = basicRequest.get(slowApi).send().flatMap { _ =>
      Future.sequence(List(
        basicRequest.get(slowApi).send(),
        basicRequest.get(slowApi).send(),
        basicRequest.get(slowApi).send()
      ))
    }
    // time(...) is a simple timing helper defined elsewhere in the original snippet
    time { Await.result(nestedF, Duration.Inf) }
  }
}

Both snippets use the same `global` ExecutionContext and are runnable on Scastie.

The former takes 12 seconds whilst the latter takes 6 seconds. It seems the former behaves as if it is CPU bound, however I do not see how that is the case, since Future#sequence should execute the HTTP requests in parallel. Why does the synchronous client wrapped in a Future behave differently from a proper async client? Is it not the case that the async client does the same kind of thing, wrapping calls in Futures under the hood?

Mario Galic
  • AFAIK, it should be considered **IO** bound, since the thread won't be doing anything but waiting. A proper async client would not block a real compute thread, so the EC would be free to make more calls, whereas the sync client will block its threads. - BTW, you may want to use `traverse` so the requests are not started when the list is defined, and you may want to add a third snippet using the sync client but wrapped in a `scala.concurrent.blocking` block (see the sketch after this comment thread). – Luis Miguel Mejía Suárez Jul 26 '20 at 14:16
  • @LuisMiguelMejíaSuárez Is it not the case that behaviour is determined by the kind of execution context being used, and since we are using the same `global` EC for both clients, should the behaviour not be the same? – Mario Galic Jul 26 '20 at 14:19
  • Why? That is like saying that two similar functions with different complexities should take the same time because both run on the **JVM**. - BTW, what is the nature of `global` in your environment? Are you running in an environment with only one thread? – Luis Miguel Mejía Suárez Jul 26 '20 at 14:49
  • @LuisMiguelMejíaSuárez I am running these two examples just in scastie (there are links in the OP). So the async client effectively does something like `Future(blocking(task))` instead of just `Future(task)`, and it is this characteristic of it that makes it truly async? – Mario Galic Jul 26 '20 at 15:12
  • Ah, then yes, I believe **Scastie** uses a single thread for its `global`. - A truly async client should do something better than `blocking`, because `blocking` _(when it works; for a single-threaded EC, for example, it does nothing)_ will just create a new thread to be blocked, whereas a truly async client should not block anything but rather work using callbacks. Take a look at the event loop in **JS**: you cannot block it and you cannot create a new one to block; everything has to be truly asynchronous. – Luis Miguel Mejía Suárez Jul 26 '20 at 15:51
  • @LuisMiguelMejíaSuárez `println(scala.concurrent.ExecutionContext.Implicits.global)` gives `parallelism = 6` in scastie. – Mario Galic Jul 26 '20 at 22:24
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/218662/discussion-between-mario-galic-and-luis-miguel-mejia-suarez). – Mario Galic Jul 27 '20 at 07:08
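
Following the suggestion in the comments, here is a sketch of the proposed third snippet (illustrative, not from the original post): the sync client again, but with each call marked with `scala.concurrent.blocking` so the global EC can spawn extra threads, and with `Future.traverse` so the request futures are only constructed as the list is traversed.

object SyncClientWithBlocking {
  def main(args: Array[String]): Unit = {
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration.Duration
    import scala.concurrent.{Await, Future, blocking}
    import scalaj.http.Http

    val delay = "3000"
    val slowApi = s"http://slowwly.robertomurray.co.uk/delay/${delay}/url/https://www.google.co.uk"

    // blocking(...) tells the global EC that this task will block,
    // so the pool may temporarily grow beyond its normal parallelism.
    def call(url: String): Future[String] =
      Future(blocking(Http(url).asString.body))

    // traverse constructs each future as it walks the list,
    // instead of building a List[Future[...]] eagerly up front.
    val nestedF = call(slowApi).flatMap { _ =>
      Future.traverse(List.fill(3)(slowApi))(call)
    }

    Await.result(nestedF, Duration.Inf)
  }
}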

1 Answer


Future#sequence should execute the HTTP requests in parallel?

First of all, Future#sequence doesn't execute anything. It just produces a single future that completes when all of its input futures complete. Evaluation (execution) of a constructed future starts immediately if there is a free thread in the EC; otherwise it is put on a queue. I am sure that in the first case you get single-threaded execution of the futures.
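
For illustration (a minimal, self-contained sketch, not from the original post): the futures below start running as soon as `Future.apply` is called, provided the EC has free threads, and `Future.sequence` merely combines the already-submitted futures into one.

object SequenceDemo {
  def main(args: Array[String]): Unit = {
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration.Duration
    import scala.concurrent.{Await, Future}

    // Each Future(...) is submitted to the EC immediately at construction time.
    val fs: List[Future[Int]] = List(
      Future { Thread.sleep(1000); 1 },
      Future { Thread.sleep(1000); 2 },
      Future { Thread.sleep(1000); 3 }
    )

    // sequence only turns List[Future[Int]] into Future[List[Int]];
    // it does not start or schedule anything itself.
    val combined: Future[List[Int]] = Future.sequence(fs)

    // With enough free threads this completes in about 1 second, not 3.
    println(Await.result(combined, Duration.Inf))
  }
}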

`println(scala.concurrent.ExecutionContext.Implicits.global)` -> `parallelism = 6`

I don't know why it is like this; it might be that the other 5 threads are always busy for some reason. You can experiment with an explicitly created EC with 5-10 threads.
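
For example (a sketch; the pool size of 10 is an arbitrary choice), a dedicated EC backed by a fixed thread pool can be created like this and used in place of the `global` import:

import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext

// A dedicated pool for the blocking HTTP calls; 10 threads is an arbitrary choice.
implicit val httpEc: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(10))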

The difference in the async case is that you don't create the future yourself; it is provided by the library, which internally doesn't block a thread. It starts the asynchronous request, "subscribes" for the result, and returns a future that completes when the result arrives.
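
Roughly, the bridge from a callback-based client to a Future looks like this (a simplified sketch with a hypothetical `CallbackClient` trait, not the actual sttp/async-http-client internals):

import scala.concurrent.{Future, Promise}
import scala.util.Try

// Hypothetical callback-based client, standing in for the real async IO machinery.
trait CallbackClient {
  def execute(url: String)(onComplete: Try[String] => Unit): Unit
}

def sendAsync(client: CallbackClient, url: String): Future[String] = {
  val p = Promise[String]()
  // No thread is blocked here: we register a callback and return the future
  // immediately; the promise is completed when the client's IO layer
  // delivers the response.
  client.execute(url)(result => p.complete(result))
  p.future
}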

Actually, the async library could use another EC internally, but I doubt it.

Btw, Futures are not supposed to contain slow/IO/blocking evaluations without wrapping them in `blocking`. Otherwise, you can potentially block the main thread pool (EC) and your app will freeze completely.

Artem Sokolov