I was going through the following code in the asyncio docs:

import asyncio

async def tcp_echo_client(message):
  reader, writer = await asyncio.open_connection(
    '127.0.0.1', 8888)

  print(f'Send: {message!r}')
  writer.write(message.encode())

  data = await reader.read(100)
  print(f'Received: {data.decode()!r}')

  print('Close the connection')
  writer.close()
  await writer.wait_closed()

asyncio.run(tcp_echo_client('Hello World!'))

However, I am not able to understand why reader.read is awaitable but writer.write is not. Since both are I/O operations, shouldn't the write method also be awaitable?

user4815162342

1 Answer

However, I am not able to understand why reader.read is awaitable but writer.write is not. Since both are I/O operations, shouldn't the write method also be awaitable?

Not necessarily. The fundamental asymmetry between read() and write() is that read() must return actual data, while write() operates purely by side effect. So read() must be awaitable because it needs to suspend the calling coroutine when the data isn't yet available. On the other hand, write() can be (and in asyncio is) implemented by stashing the data in some buffer and scheduling it to be written at an opportune time.
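
To make the asymmetry concrete, here is a minimal sketch using a toy in-memory stream. `ToyStream` is a hypothetical class invented for illustration and is not how asyncio implements its streams; it only shows why `read()` has to be a coroutine while `write()` can be a plain method.

```python
import asyncio


class ToyStream:
    """Hypothetical in-memory stream; NOT asyncio's real implementation."""

    def __init__(self):
        self._buffer = bytearray()
        self._data_ready = asyncio.Event()

    def write(self, data: bytes) -> None:
        # Pure side effect: stash the bytes and return immediately.
        self._buffer.extend(data)
        self._data_ready.set()

    async def read(self, n: int) -> bytes:
        # Must return actual data, so suspend until some is available.
        while not self._buffer:
            self._data_ready.clear()
            await self._data_ready.wait()
        chunk = bytes(self._buffer[:n])
        del self._buffer[:n]
        return chunk


async def demo() -> bytes:
    stream = ToyStream()
    reader_task = asyncio.create_task(stream.read(100))
    await asyncio.sleep(0)           # let the reader start and suspend
    assert not reader_task.done()    # nothing to return yet
    stream.write(b"Hello World!")    # plain call, no await needed
    return await reader_task


result = asyncio.run(demo())
print(result)
```

The reader task stays suspended until `write()` deposits data, whereas `write()` itself never needs to wait for anything.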

This design has important consequences: writing data faster than the other side reads it causes the buffer to grow without bound, and exceptions that occur while the buffered data is being sent are effectively lost, because write() has already returned. Both issues are addressed by awaiting writer.drain(), which applies backpressure: when the buffer has grown too large, it suspends the coroutine while the buffer is written out to the OS, and resumes it once the buffer size drops beneath a threshold. The write() documentation advises that "calls to write() should be followed by drain()."
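
The write()/drain() pairing can be sketched as follows. This is only an illustration under stated assumptions: the echo server, the `handle_echo` and `main` names, the payload size, and the use of port 0 (letting the OS pick a free port) are all choices made for the example, not part of the original question.

```python
import asyncio


async def handle_echo(reader, writer):
    # Illustrative echo server: send back whatever arrives until EOF.
    while True:
        data = await reader.read(1024)
        if not data:
            break
        writer.write(data)
        await writer.drain()
    writer.close()
    await writer.wait_closed()


async def main() -> bytes:
    # Port 0 asks the OS for any free port (an assumption for this demo).
    server = await asyncio.start_server(handle_echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        payload = b"x" * 1000
        received = bytearray()
        for _ in range(5):
            writer.write(payload)   # buffers the data, returns at once
            await writer.drain()    # suspends only if the buffer is too full
            received += await reader.readexactly(len(payload))
        writer.close()
        await writer.wait_closed()
    return bytes(received)


echoed = asyncio.run(main())
print(len(echoed))
```

With small payloads like these, `drain()` typically returns immediately; it only suspends the coroutine when the write buffer has grown past its upper limit.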

The lack of backpressure in write() is a result of asyncio streams being implemented on top of a callback-based layer in which a non-async write() is much more convenient to use than a fully asynchronous alternative. See this article by Nathaniel J Smith, the author of trio, for a detailed treatment of the topic.

user4815162342
  • Note: Python 3.8 will have `awrite()` and `aclose()` asynchronous stream methods to avoid `write()` / `drain()` and `close()` / `wait_closed()` confusion. – Andrew Svetlov Oct 03 '18 at 08:17
  • @AndrewSvetlov That's fantastic! It would also be very useful to have a `flush()` coroutine, the asyncio equivalent of `file.flush`. The idea is to simply flush the stream buffer to the OS, just like drain() but without the watermarks. – user4815162342 Oct 03 '18 at 08:42
  • I doubt that `flush()` is useful, because flushing offers no delivery guarantee in distributed systems. The peer may still not receive the *flushed* data. The file system is different: flushed buffers are in kernel memory, and they will be saved to disk unless the box is rebooted (very unlikely). `fsync` provides an even stronger guarantee. Anyway, if you still want `flush()`, please create an issue on bugs.python.org for discussion. – Andrew Svetlov Oct 03 '18 at 10:34
  • @AndrewSvetlov True, `flush()` offers no delivery guarantees, but it's still useful for transferring the data to the OS. This is the same as flushing a regular file, where the data is not synced but passed on to the kernel. A socket also has a kernel buffer, which is discussed by Nathaniel, who goes on to argue that asyncio's system of watermarks on top of the kernel's buffering is superfluous and leads to bufferbloat. (I searched the async-sig archives, but couldn't find a refutation of that claim.) But since `flush()` is orthogonal to `drain()`, I'll just file a BPO issue. – user4815162342 Oct 03 '18 at 10:57
  • I used to disable watermarks in the aiohttp server. As a result, I got a 10-15% slowdown, IIRC. – Andrew Svetlov Oct 03 '18 at 17:45
  • @AndrewSvetlov I didn't know that. Just to make it clear, at that point you had transport watermarks turned off and [your own buffering](https://github.com/asyncio-docs/asyncio-doc/issues/17#issuecomment-304178161), and it was slower than the same buffering code with the watermarks on? Maybe the `TCP_NODELAY` socket option automatically set by asyncio invalidates Nathaniel's argument. – user4815162342 Oct 03 '18 at 18:19
  • Now we use `transp.set_write_buffer_limits()` with the default parameters, I guess. – Andrew Svetlov Oct 04 '18 at 08:30
  • @user4815162342 Thanks for the wonderful answer and the article link. Cleared a lot of things. – Mukul Chakravarty Oct 04 '18 at 08:54