2

With Scala, how can I convert an Iterable[t] to a Future[Seq[T]]?

Basically I want to read all the values returned by a query on Apache Ignite, that returns an IgniteCursor.

I want to read from that cursor in a non blocking way.

I can write:

val iterable = cursor.asScala
val result = iterable.toList

Future{
  result
}

But I think that this code is blocking, not async. I'm right?

Does trasforming an Iterable to a Future[Seq] make sense?

UPDATE

My goal is to get a small list of records without blocking the caller thread or a worker thread because I will receive many concurrent calls.

My Iterable is actually an IgniteCursor so internally I suppose it will perform some network/database operations. Usually there are methods to perform these operations asyncronously. For example to read a single value I can use getAsync instead of get.
For the cursor I just have getAll function, so my idea was to use the Iterable in a smart way.
My understanding is that when using async methods threads are not blocked but they are free to perform other tasks until network operation is complete. I would expect a function that returns a Future/IgniteFuture or a callback.
There is a way to fetch all the records without keep a thread blocked?

Finally to correctly free up resources I need to call close function. If I write Future(cursor.asScala.toList) when should the close method be called? On the onComplete of the Future?

Another easy solution is to write Future{cursor.getAll.asScala} but I suppose that internally a worker thread will be blocked to wait for all the records.

Maybe I missing something?

UPDATE 2

In other words there is a way to fetch a list of records from Ignite using "Asynchronous Non-Blocking IO"?

Davide Icardi
  • 11,919
  • 8
  • 56
  • 77
  • 1
    Why not `val result = Future(cursor.asScala.toList)` ? – jwvh Sep 22 '18 at 00:04
  • 1
    Note that with a `Future[Seq[T]]` you'd not block the caller, but the Future would not complete until all data has been read. For large data, you may want to look at some way to "stream" the data (so that you can start reading as soon as the first piece of data arrives). – Thilo Sep 22 '18 at 00:07
  • Where exactly is it blocking? Don't answer me, just find the place, and put `Future { ... }` around it. Now, it's not blocking. – Dima Sep 22 '18 at 01:38
  • @jwvh See my update section. Using `Future(cursor.asScala.toList)` I don't know when to call `close` of the `IgniteCursor`. Plus my concern is that I will block anyway a worker thread... I'm right? – Davide Icardi Sep 22 '18 at 08:51
  • @Thilo See my update section. In my use case I have just a few records, but a lot of concurrent calls. I can probably "stream" data, but I don't know where to call `IgniteCursor.close` in this way and my primary concern is to handle a lot of concurrent calls. – Davide Icardi Sep 22 '18 at 08:56
  • @Dima See my update section. My concern was not only to don't block the caller thread but also to efficently use thread pool to support a large number of concurrent requests. – Davide Icardi Sep 22 '18 at 08:57
  • 1
    @DavideIcardi; I don't quite understand your concern about blocked worker threads. That's what they're for. We launch threads so that _they_ will block on slow IO routines, leaving the main thread free to do other useful things. – jwvh Sep 22 '18 at 09:10
  • @jwvh What's the point of having `fAsync` function if I can just write `Future{f}`? My understanding is that `fAsync` uses some kind of "non-blocking IO" feature, that allow to better use threads. See for example this discussion: https://stackoverflow.com/a/40049018/209727 My question is: how can I query records from Ignite using "non-blocking IO"? – Davide Icardi Sep 22 '18 at 09:32
  • Do you have a link to this `fAsync` function? If that uses native non-blocking IO, then it would indeed be better than blocking a worker thread (which also is not really a problem since that is what it is for). But then you need to translate this async result into a Scala Future somehow (for example by using a Promise that you can complete from a callback -- if such a mechanism if available with `fAsync` or by polling for completion from a timer thread). – Thilo Sep 22 '18 at 11:18
  • I'd probably just do `Future(cursor.getAll.toScala)(dedicatedExecutorWithAppropriateNumberOfThreadsForControlledConcurrency)`. This way `getAll` takes care of closing the cursor, and you can configure your executor with however many worker threads make sense to control concurrency levels. – Thilo Sep 22 '18 at 11:22
  • I don't get it. If all you have is `getAll`, then how "smart" can you possibly get with that? You have two choice: either call it inline, or offload to a thread. – Dima Sep 22 '18 at 11:49
  • Note that while `getAll` is not smart, the `Iterable` implemented by `IgniteCursor` **IS**. – alamar Sep 27 '18 at 12:17

2 Answers2

2

Does trasforming an Iterable to a Future[Seq] make sense?

That depends on what your goals are.

Iterable is lazy and memory efficient. Seq is eager and will take up more memory. By going from Iterable to Seq you are essentially saying: "I don't want to see any data until everything is ready and loaded into memory."

By making it a Future[Seq] you are essentially saying: "I'll come back later after (hopefully) all the data elements are ready and loaded into memory."

jwvh
  • 50,871
  • 7
  • 38
  • 64
  • Thanks. I have added an update section to better explain my problem. My use case is "I'll come back later after (hopefully) all the data elements are ready and loaded into memory." But I also want to scale well without keep threads blocked. I would expect to have some built-in async function or callbacks. – Davide Icardi Sep 22 '18 at 09:00
1

I don't get it. If all you have is getAll, then how "smart" can you possibly get with that? You have two choice: either call it inline, or offload to a thread.

One thing you can do is wrap the call around into blocking:

Future { 
   blocking { 
      cursor.getAll.asScala
   }
}

This tells the executor, that the worker thread will block, so it will not count it towards the thread pool limit. Blocking by itself isn't bad. It's only bad when it leads to thread starvation.

But be careful with this. If your cursors a "heavy", and you have a lot of them running in parallel, you'll have problems, other than blocked threads. You could run out of memory, for example, because you'll be trying to fit all of the results there at once. Or you'll saturate the IO.

Dima
  • 39,570
  • 6
  • 44
  • 70