
Whichever way I think about asynchronicity, I still come up with some sort of concurrency.

This guy here says that asynchronicity can have two flavors:

  • simulated asynchronicity (let me call it that) - where a thread is spawned for the async execution of some operations. To me this is fake asynchronicity and it's similar to concurrency. I don't see any real benefits here.

  • hardware-supported async - where the request is just forwarded to the hardware (like the hard disk or the network card) and the control of execution is immediately returned to the CPU. When the IO operation is ready, the CPU is notified and a callback is executed. This seems OK if you think about one single IO request, but if I try to extend the example to multiple IO requests then I still arrive at concurrency, only now the concurrency has been forwarded to the hardware. Here's a diagram for two async IO calls:

    1. CPU ----- io async req 1 -----> Hardware
    2. CPU <------ returns the control (no data) ------- Hardware
    3. CPU ----- io async req 2 ------> Hardware
    4. CPU <------ returns the control (no data) ------- Hardware
    5. CPU executes other operations while the Hardware executes two IO tasks
    6. CPU <------- data for req 1 ------- Hardware
    7. CPU executes the callback
    8. CPU executes other operations
    9. CPU <-------- data for req 2 ------- Hardware
    10. CPU executes the callback

As you can see, at line 5 the hardware handles two tasks simultaneously, so the concurrency has been transferred to the hardware. So, as I said, whichever way I think about asynchronicity I still come up with some sort of concurrency; of course, this time it is not the CPU that handles it but the IO hardware.
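
To make the diagram concrete, here is a minimal sketch of what I have in mind, using Python's asyncio (the names fake_io and on_complete are made up for the example, and asyncio.sleep only stands in for the hardware finishing the IO some time later):

    import asyncio

    async def fake_io(request_id: str, delay: float) -> str:
        # stand-in for a real async read; the sleep plays the role of the
        # hardware working on the request
        await asyncio.sleep(delay)
        return f"data for {request_id}"

    def on_complete(task: asyncio.Task) -> None:
        # lines 7 and 10 of the diagram: the CPU runs the callback once data arrives
        print("callback:", task.result())

    async def main() -> None:
        # lines 1-4: issue both requests; control returns immediately, no data yet
        req1 = asyncio.create_task(fake_io("req 1", 0.1))
        req2 = asyncio.create_task(fake_io("req 2", 0.2))
        req1.add_done_callback(on_complete)
        req2.add_done_callback(on_complete)

        # line 5: the CPU (this single thread) does other work in the meantime
        for i in range(3):
            print("doing other work", i)
            await asyncio.sleep(0.05)

        await asyncio.gather(req1, req2)  # wait until both callbacks have fired

    asyncio.run(main())

The single thread never blocks, yet both requests are "in flight" at the same time - which is exactly why it looks to me like the concurrency has simply moved elsewhere.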

Am I wrong?
Does the IO hardware support concurrency?
If yes, is the concurrency offered by the IO hardware much better than that of the CPU? If not, then the hardware is executing multiple IO operations synchronously, in which case I don't see the benefits of asynchronicity vs. concurrency.

Thanks in advance for your help.

humbletrader

2 Answers


Async IO is mainly about not having to keep a thread alive for the duration of the IO. Imagine a server waiting on 1,000,000 TCP connections for data to arrive. With one thread per connection, that is a lot of memory burned.

Instead, a threadless async IO is issued, and it's just a small data structure. It's a registration with the OS that says "if data arrives, call me back".
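
As a rough illustration (not tied to any particular platform; Python's selectors module just happens to make the idea compact), this is what such a registration looks like: one thread, one selector, and each connection costs only a small registration entry instead of a whole thread. The echo handler and the port number are placeholders, and error handling is omitted:

    import selectors
    import socket

    sel = selectors.DefaultSelector()   # epoll/kqueue under the hood

    def accept(server_sock: socket.socket) -> None:
        conn, _addr = server_sock.accept()
        conn.setblocking(False)
        # "if data arrives, call me back" - just a small bookkeeping entry with the OS
        sel.register(conn, selectors.EVENT_READ, data=echo)

    def echo(conn: socket.socket) -> None:
        data = conn.recv(4096)
        if data:
            conn.sendall(data)
        else:
            sel.unregister(conn)
            conn.close()

    server = socket.socket()
    server.bind(("0.0.0.0", 9000))      # arbitrary port for the example
    server.listen()
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ, data=accept)

    # a single thread services however many connections are registered
    while True:
        for key, _mask in sel.select():
            callback = key.data
            callback(key.fileobj)

Registering a million sockets this way is a million small entries in a kernel data structure; spawning a million threads is a million stacks.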

How IOs map to hardware operations varies. Some hardware might have concurrency built in. My SSD certainly does, because it has multiple independent flash chips on it. Other hardware might not be able to process multiple IOs concurrently: older magnetic disks did not do that, and simple NICs have no concurrency. There, the driver or OS will serialize requests.

But that has nothing to do with how you initiate the IO. It's the same for thread-based and threadless IO. The driver and the hardware usually can't tell the difference (or don't care).

Async IO is about having fewer threads. It's not about driving the hardware differently at all.

usr
  • I think you missed a key point: Async IO is completely unrelated to hardware concurrency. Async IO is used to save threads and memory, nothing else. Hardware concurrency is meant to increase performance if available. You cannot simulate hardware concurrency on the CPU. There is no trade-off or choice here. – usr Jan 31 '16 at 15:24
  • Node.js is not popular because of async IO. Other platforms have async IO as well. Node.js is popular *despite* it, because a few years ago async IO forced the code into a callback mess. It's terrible for code quality. Frankly, most people do not understand scalability and do not understand the advantages and disadvantages of Node.js. – usr Jan 31 '16 at 15:26
  • `If the hardware handles the IO operations sequentially, then the proof is not so obvious` What's not obvious about the 1M TCP connections example that I have? You can't spawn 1M threads. – usr Jan 31 '16 at 15:36
  • Thanks again for your help. I think I'm getting somewhere. Of course the OS cannot handle 1 million threads, and of course async IO saves threads (and implicitly memory), but I thought there were some other gains (like speed or something else). – humbletrader Jan 31 '16 at 15:56
  • No, (almost) no speed gains compared to just starting many threads. Might be a small win or a small loss. Usually in the noise. The hardware does the same thing in all cases! – usr Jan 31 '16 at 16:01
  • Even if the two approaches are equal from a speed point of view, the async approach still wins because of the memory gains. – humbletrader Jan 31 '16 at 16:08
  • Yes, that is the reason this is being done. Otherwise, it can be a terrible hit on code quality, complexity and bug-freedom. – usr Jan 31 '16 at 16:10

It doesn't seem like you understand asynchronous I/O at all. Here's a typical example of how asynchronous I/O might work:

A thread is running. It wants to receive some data from the network. It does an asynchronous network read operation. The call into the operating system reports that no data is ready yet but arranges to notify when some data is ready. The thread keeps running until data arrives at the network card. The network card generates an interrupt, the interrupt handler dispatches to code that notices that there's a pending asynchronous read, and it queues an event signalling that the read has completed. Later, the thread is finished with all the work it has to do at that time, so it checks for events. It sees that the read completed, gets the data, processes it, and does another asynchronous read.

The thread may have dozens of asynchronous I/O operations pending at any particular time.
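
A stripped-down version of that loop, sketched here with Python's selectors module (process and do_other_work are placeholders, run() expects an already-connected socket, and error handling is left out):

    import selectors
    import socket

    sel = selectors.DefaultSelector()

    def start_read(conn: socket.socket) -> None:
        # "does an asynchronous network read operation": nothing blocks here,
        # we only tell the OS we are interested in incoming data
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)

    def do_other_work() -> None:
        pass  # whatever the thread is busy with while no data has arrived

    def process(data: bytes) -> None:
        print("received", len(data), "bytes")

    def run(conn: socket.socket) -> None:
        start_read(conn)
        while True:
            do_other_work()
            # "the thread ... checks for events": a zero timeout means just poll
            # and return immediately, whether or not a read has completed
            for key, _mask in sel.select(timeout=0):
                data = key.fileobj.recv(4096)   # the read completed, fetch the data
                process(data)
                # the registration stays in place, so the next arrival is
                # effectively "another asynchronous read"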

David Schwartz
  • Thank you for your answer. I think I have a good understanding of async, but I might have confused you with the text diagram. Please note that at lines 2 and 4 I didn't mean that the hardware is sending any data back to the CPU. What I meant was: the hardware gives back the control to the CPU. The data for request 1 is sent to the CPU only at line 6. – humbletrader Jan 31 '16 at 13:57