44

What's the point of introducing async for and async with? I know there are PEPs for these statements, but they are clearly intended for language designers, not average users like me. A high-level rationale supplemented with examples would be greatly appreciated.

I did some research myself and found this answer:

The async for and async with statements are needed because you would break the yield from/await chain with the bare for and with statements.

The author didn't give an example of how the chain might be broken though, so I'm still confused. Furthermore, I notice that Python has async for and async with, but not async while and async try ... except. This sounds strange because for and with are just syntactic sugar for while and try ... except, respectively. I mean, wouldn't async versions of the latter statements allow more flexibility, given that they are the building blocks of the former?

There is another answer discussing async for, but it only covers what it is not for, and didn't say much about what it is for.

As a bonus, are async for and async with syntax sugars? If they are, what are their verbose equivalent forms?

nalzok
  • 4
    *"`for` and `with` just syntax sugars for `while` and `try ... except`"* — Nope, far from it, they're each their own thing. – deceze Apr 14 '21 at 12:50
  • 1
    `for` and `with` invoke methods on the objects you put in, which are supposed to return certain values immediately. With `async for` and `async with`, these methods can be *async*, allowing them to do some *non-blocking work*. – deceze Apr 14 '21 at 12:52
  • 3
    @deceze Well, the official docs [states](https://docs.python.org/3/reference/compound_stmts.html#with) that the `with` statement "is semantically equivalent to" `try...except...finally`. And you can easily implement a `for` loop with `while` and `next`. Maybe they are not syntax sugars, but they are not *that* different either. – nalzok Apr 14 '21 at 12:59
  • @deceze If the object being iterated over implements `__iter__`/`__next__` then call it; if it implements `__aiter__`/`__anext__` then `await` it. Why introduce a new syntax when we don't need to? – nalzok Apr 14 '21 at 13:03
  • In the manual where it shows the `try..except..finally` equivalent of `with`, note where it calls `enter()` and `exit()`. With `async with`, *those functions* can be async. If you wrote it as `try..except..finally`, you'd write `await enter()` and `await exit()` there. Similar for `for` and `__iter__`. – deceze Apr 14 '21 at 13:05
  • 1
    You need this new syntax, because where else would you put the `await` for async `__enter__`/`__exit__`/`__iter__`/`__next__` if they're implicitly called by the "sugar" `with`/`for` statements? – deceze Apr 14 '21 at 13:07
  • @deceze I actually can get your point, that `async for/with` allows more explicit programs. I was saying we can simply rewrite `enter = type(manager).__enter__` in the manual as `enter = type(manager).__enter__ if hasattr(type(manager), "__enter__") else lambda x: asyncio.run(__enter__(x))`. The example is sloppy but you get the idea. – nalzok Apr 14 '21 at 13:28
  • 2
    No you can't, because that's just a blocking call executing an async function. It will not allow the event loop to execute any other scheduled coroutines, because you're just starting and stopping one event loop in order to resolve one async `enter`. – deceze Apr 14 '21 at 13:31
  • @deceze OK I guess I fully understand now. The reason we need `async for/with` is exactly because they are opaque high-level structures, which we cannot sneak `await` into. With low-level structures like `while` and `try ... except`, we can write whatever we want with maximum flexibility. – nalzok Apr 14 '21 at 13:35
  • 2
    If you want to put it this way, yes. `for` and `with` encapsulate *protocols* for specific patterns involving specific methods, which you *can* replicate "manually" with `while` and `try..except..finally`. But the point is exactly to make those patterns reusable instead of writing a ton of boilerplate every time. And since that boilerplate differs for async versions, you need specific `async` versions of them. – deceze Apr 14 '21 at 13:40
  • perhaps useful for `for`: https://stackoverflow.com/questions/56161595/how-to-use-async-for-in-python – Charlie Parker Sep 02 '21 at 19:06

3 Answers

38

TLDR: for and with are non-trivial syntactic sugar that encapsulate several steps of calling related methods. This makes it impossible to manually add awaits between these steps – but properly usable async for/with need that. At the same time, this means it is vital to have async support for them.


Why we can't await nice things

Python's statements and expressions are backed by so-called protocols: When an object is used in some specific statement/expression, Python calls corresponding "special methods" on the object to allow customization. For example, x in [1, 2, 3] delegates to list.__contains__ to define what in actually means.
Most protocols are straightforward: There is one special method called for each statement/expression. If the only async feature we have is the primitive await, then we can still make all these "one special method" statements/expressions "async" by sprinkling await at the right place.
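
As a minimal sketch of such a "one special method" protocol: the hypothetical Evens class below shows the plain delegation of in to __contains__, and the hypothetical RemoteTable (not taken from any library) returns an awaitable from __getitem__, so a single await around the subscription is enough to make it "async". The asyncio.sleep(0) call is only a stand-in for real I/O.

```python
import asyncio


class Evens:
    """Hypothetical container: `x in Evens()` delegates to __contains__."""

    def __contains__(self, item):
        return item % 2 == 0


class RemoteTable:
    """Hypothetical mapping whose lookup is asynchronous."""

    def __init__(self, rows):
        self._rows = dict(rows)

    def __getitem__(self, key):
        # Subscription calls exactly this one method, so returning an
        # awaitable lets the caller wrap the expression in `await`.
        return self._fetch(key)

    async def _fetch(self, key):
        await asyncio.sleep(0)  # stand-in for real I/O
        return self._rows[key]


async def main():
    table = RemoteTable({"a": 1})
    return await table["a"]  # one protocol step, one await


result = asyncio.run(main())
has_four = 4 in Evens()
```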

In contrast, the for and with statements both correspond to multiple steps: for uses the iterator protocol to repeatedly fetch the __next__ item of an iterator, and with uses the context manager protocol to both enter and exit a context.
The important part is that both have more than one step that might need to be asynchronous. While we could manually sprinkle an await at one of these steps, we cannot hit all of them.

  • The easier case to look at is with: we can look at the __enter__ and __exit__ methods separately.

    We could naively define a synchronous context manager with asynchronous special methods. For entering this actually works by adding an await strategically:

    with AsyncEnterContext() as acm:
        context = await acm
        print("I entered an async context and all I got was this lousy", context)
    

    However, it already breaks down if we use a single with statement for multiple contexts: We would first enter all contexts at once, then await all of them at once.

    with AsyncEnterContext() as acm1, AsyncEnterContext() as acm2:
        context1, context2 = await acm1, await acm2  # wrong! acm1 must be entered completely before loading acm2
        print("I entered many async contexts and all I got was a rules lawyer telling me I did it wrong!")
    

    Worse, there is just no single point where we could await exiting properly.

While it's true that for and with are syntactic sugar, they are non-trivial syntactic sugar: They make multiple actions nicer. As a result, one cannot naively await individual actions of them. Only a blanket async with and async for can cover every step.
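
To make "every step" concrete, here is a sketch of what async with does, written out by hand. The expansion follows the one given in PEP 492; AsyncResource and the log list are hypothetical names used only for illustration. Note that both the enter step and the exit step are awaited, which is exactly what a plain with cannot express.

```python
import asyncio
import sys

log = []


class AsyncResource:
    """Hypothetical async context manager that records what happens."""

    async def __aenter__(self):
        await asyncio.sleep(0)  # stand-in for e.g. opening a connection
        log.append("enter")
        return "resource"

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)  # stand-in for e.g. closing a connection
        log.append("exit")
        return False  # do not suppress exceptions


async def with_statement():
    async with AsyncResource() as res:
        log.append(f"body:{res}")


async def expanded():
    # Hand-written equivalent of the `async with` above.
    manager = AsyncResource()
    aenter = type(manager).__aenter__
    aexit = type(manager).__aexit__
    res = await aenter(manager)          # awaited enter step
    try:
        log.append(f"body:{res}")
    except BaseException:
        if not await aexit(manager, *sys.exc_info()):  # awaited exit on error
            raise
    else:
        await aexit(manager, None, None, None)         # awaited normal exit


asyncio.run(with_statement())
statement_log = list(log)
log.clear()
asyncio.run(expanded())
expanded_log = list(log)
```

Both variants record the same sequence of steps; the statement form just hides the boilerplate.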

Why we want to async nice things

Both for and with are abstractions: They fully encapsulate the idea of iteration/contextualisation.

Picking one of the two again, Python's for is the abstraction of internal iteration – for contrast, a while is the abstraction of external iteration. In short, that means the entire point of for is that the programmer does not have to know how iteration actually works.

  • Compare how one would iterate a list using for or while:
    some_list = list(range(20))
    index = 0                      # lists are indexed from 0
    while index < len(some_list):  # lists are indexed up to len-1
        print(some_list[index])    # lists are directly index'able
        index += 1                 # lists are evenly spaced
    
    for item in some_list:         # lists are iterable
        print(item)
    
    The external while iteration relies on knowledge about how lists work concretely: It pulls implementation details out of the iterable and puts them into the loop. In contrast, internal for iteration only relies on knowing that lists are iterable. It would work with any implementation of lists, and in fact any implementation of iterables.

Bottom line is the entire point of for – and with – is not to bother with implementation details. That includes having to know which steps we need to sprinkle with async. Only a blanket async with and async for can cover every step without us knowing which.

Why we need to async nice things

A valid question is why for and with get async variants, but others do not. There is a subtle point about for and with that is not obvious in daily usage: both represent concurrency – and concurrency is the domain of async.

Without going too much into detail, a handwavy explanation is the equivalence of handling routines (()), iterables (for) and context managers (with). As has been established in the answer cited in the question, coroutines are actually a kind of generator. Obviously, generators are also iterables and in fact we can express any iterable via a generator. The less obvious piece is that context managers are also equivalent to generators – most importantly, contextlib.contextmanager can translate generators to context managers.
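
A small sketch of that last point, using contextlib.contextmanager and its async sibling contextlib.asynccontextmanager from the standard library. The function names managed and amanaged and the events list are made up for this illustration.

```python
import asyncio
from contextlib import asynccontextmanager, contextmanager

events = []


@contextmanager
def managed(name):
    # Code before `yield` plays the role of __enter__ ...
    events.append(f"enter {name}")
    try:
        yield name
    finally:
        # ... and code after it plays the role of __exit__.
        events.append(f"exit {name}")


@asynccontextmanager
async def amanaged(name):
    await asyncio.sleep(0)  # the enter step may suspend
    events.append(f"aenter {name}")
    try:
        yield name
    finally:
        await asyncio.sleep(0)  # and so may the exit step
        events.append(f"aexit {name}")


async def main():
    with managed("sync") as s:
        events.append(f"use {s}")
    async with amanaged("async") as a:
        events.append(f"use {a}")


asyncio.run(main())
```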

To consistently handle all kinds of concurrency, we need async variants for routines (await), iterables (async for) and context managers (async with). Only a blanket async with and async for can cover every step consistently.

MisterMiyagi
  • I'm still sort of puzzled about the use of `async` for `for` and `with`s. My understanding is that `async def` creates a coroutine -- a function that can give up execution control to the caller. But in the `async for x in range(10)` -- I don't understand why the async is needed since I've written for loops that call awaits without issues e.g. ` for i in range(num_steps): await asyncio.sleep(1)`. So I don't understand why I need the async for the for loop. Can you clarify this? – Charlie Parker Jun 22 '22 at 19:09
  • 1
    @CharlieParker In your example `for` loop, only the *body* is `async`. In an `async for`, the *iterable* itself can be `async` – for example, it could fetch data from a remote database, waiting for each item until it arrives. – MisterMiyagi Jun 22 '22 at 19:22
  • hmmm...but my normal for loop I wrote that calls the `asyncio.sleep(1)` is doing the fetching (or simulating it at least with the sleep!) and at the same time allowing the execution to go concurrent (non-blocking) by using the `await` keyword. Right? So basically I don't actually ever need to use an `async for` if I don't want to. I can always do a normal for loop and I could manually call `await it.next()`. Right? `async for` is more for convenience. Is it? – Charlie Parker Jun 22 '22 at 19:26
  • 1
    @CharlieParker Well, yes – you could manually unroll `async for` just like you can unroll a regular `for` to a `while`. Both are abstractions, not fundamental primitives. Similarly, you could "unroll" both `async with` and `with` using `try:` `except:`. – MisterMiyagi Jun 22 '22 at 19:30
  • but my point is not really about the unrolling for the sake of unrolling. In async programming there is an important notion of concurrency where we can use the `await` statement to give control back to the caller and let something else run. Here I am wondering if what we are awaiting is the next item in the (async) iterator and other details like if we are awaiting only the next one or we are awaiting all future calls to the for loop like we sort of do with `gather`. Like what are we awaiting exactly with a `async for`? – Charlie Parker Jun 22 '22 at 19:33
  • Like hiding the details of a `for` and a `while` is fine because there is nothing weird going on with concurrency. It's mostly just convenience. But with the `async for` the question is how many things are we awaiting? What are we awaiting? Who gets control? Is it a "markovian" await? It just seems slightly more subtle, perhaps in a more significant way than in synchronous programming -- beyond just increasing indexes or counters manually. – Charlie Parker Jun 22 '22 at 19:35
  • 1
    @CharlieParker That you assume `for` is about "just increasing indexes or counters" just underlines that it is an important abstraction for hiding details. Even simple nestings of higher-order iterators like `map` or the `itertools` are *extremely* complex in total. An `async` iterator can be *simpler* logically, because `async for`+event-loops switch deterministically whereas `for`+threads can arbitrarily interleave. – MisterMiyagi Jun 22 '22 at 19:46
  • beyond my simplistic example I gave -- what I was truly trying to understand is the concurrency mechanism of `async for` and how it calls `await`, on which objects, who gets control of execution etc. That is what I'm mostly trying to understand. Is the cartoon example (for sake of pedagogy) of how `async for` works something like this, at the beginning of each loop we do `await iter.next()` and we do it on a single and each loop iteration only -- thus only continuing to the next loop once the current `await` has returned? Is this the correct way to understand the async part of this? – Charlie Parker Jun 22 '22 at 20:08
  • 1
    @CharlieParker Yes, that is basically correct. If you think of `for x in y:` as a `while` repeatedly running `x = y.__next__()`, you can similarly think of `async for x in y:` as a `while` repeatedly running `x = await y.__anext__()`. This allows to suspend "inside" the `async for` waiting for the async iterator to produce the next item. – MisterMiyagi Jun 28 '22 at 13:07
  • 1
    but that seems bad to me. Wouldn't that mean that `async for` loops block? Wouldn't it be better to start a bunch of tasks in the loop and then await them later in a gather outside the loop? What is the use of `async for` loops if they block with the `await` keyword? They only provide a small benefit that perhaps we can run something else during the await but beyond that it basically blocks. I feel I am missing something. – Charlie Parker Jun 28 '22 at 17:44
  • 1
    @CharlieParker Allowing the loop to (asynchronously! == letting others run) block while waiting for an item is the entire point of `async for`. This potentially isn't "small" but seconds, minutes, hours or even outright blocking *indefinitely* on results from another task. That is something fundamentally different from grouping multiple tasks together, ala `gather`. If that isn't sufficient information for you, I am not sure whether continuing a trail of comments is appropriate to clear up whatever piece of information is actually missing for you. – MisterMiyagi Jun 28 '22 at 18:41
  • What's confusing is that this is an `async` that doesn't require `await` to get the result from, and is, in fact, a blocking instruction (At first glance I think it may be slightly less confusing to have `await for`, instead? as the thing in the for is async already, and really is being awaited?) – njzk2 May 13 '23 at 20:10
20

async for and async with are a logical continuation of the development from lower to higher levels.

In the past, the for loop in a programming language used to be capable only of simple iterating over an array of values linearly indexed 0, 1, 2 ... max.

Python's for loop is a higher-level construct. It can iterate over anything supporting the iteration protocol, e.g. set elements or nodes in a tree - none of them has items numbered 0, 1, 2, ... etc.

The core of the iteration protocol is the __next__ special method. Each successive call returns the next item (which may be a computed value or retrieved data) or signals the end of iteration.
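
As a rough sketch of that protocol, the for loop can be unrolled by hand with iter(), next() and StopIteration. The Countdown class and the manual_for helper are hypothetical names for this example only.

```python
class Countdown:
    """Hypothetical iterator counting down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals the end of iteration
        self.current -= 1
        return self.current + 1


def manual_for(iterable):
    """Collect all items the way a `for` loop would fetch them."""
    collected = []
    iterator = iter(iterable)      # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:
            break
        collected.append(item)
    return collected


unrolled_items = manual_for(Countdown(3))
loop_items = [item for item in Countdown(3)]
```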

The async for is the asynchronous counterpart: instead of calling the regular __next__ it awaits the asynchronous __anext__, and everything else remains the same. That allows common idioms to be used in async programs:

# 1. print lines of text stored in a file
for line in regular_file:
    print(line)

# 2A. print lines of text as they arrive over the network,
#
# The same idiom as above, but the asynchronous character makes
# it possible to execute other tasks while waiting for new data
async for line in tcp_stream:
    print(line)

# 2B: the same with a spawned command
async for line in running_subprocess.stdout:
    print(line)
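
The snippets above need a real file, socket or subprocess. A self-contained sketch of the same mechanism, with a hypothetical Ticker class standing in for the stream and asyncio.sleep(0) standing in for waiting on data, could look like this. The unrolled variant shows where the await sits in each iteration.

```python
import asyncio


class Ticker:
    """Hypothetical async iterator producing 0 .. n-1."""

    def __init__(self, n):
        self.n = n
        self.i = 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.i >= self.n:
            raise StopAsyncIteration  # async counterpart of StopIteration
        await asyncio.sleep(0)        # other tasks may run while we wait
        self.i += 1
        return self.i - 1


async def with_async_for():
    items = []
    async for item in Ticker(3):
        items.append(item)
    return items


async def unrolled():
    # Roughly what `async for` does: one awaited __anext__ per iteration.
    items = []
    iterator = Ticker(3).__aiter__()
    while True:
        try:
            item = await iterator.__anext__()
        except StopAsyncIteration:
            break
        items.append(item)
    return items


loop_result = asyncio.run(with_async_for())
unrolled_result = asyncio.run(unrolled())
```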

The situation with async with is similar. To summarize: the try .. finally construct was replaced by the more convenient with block - now considered idiomatic - that can communicate with anything supporting the context manager protocol with its __enter__ and __exit__ methods for entering and exiting the block. Naturally, everything formerly used in a try .. finally was rewritten to become a context manager (locks, pairs of open-close calls, etc.).

async with is again a counterpart with asynchronous __aenter__ and __aexit__ special methods. Other tasks may run while the asynchronous code for entering or exiting a with block waits for new data or a lock or some other condition to become fulfilled.
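
A minimal sketch of that behaviour with asyncio.Lock from the standard library: while worker "b" waits to enter its async with block, worker "a" keeps running. The worker function, the names "a" and "b" and the short sleep are assumptions made for this example.

```python
import asyncio


async def worker(name, lock, log):
    log.append(f"{name} waiting")
    async with lock:                # may suspend until the lock is free
        log.append(f"{name} acquired")
        await asyncio.sleep(0.01)   # stand-in for work done under the lock
        log.append(f"{name} releasing")


async def main():
    lock = asyncio.Lock()
    log = []
    await asyncio.gather(worker("a", lock, log), worker("b", lock, log))
    return log


lock_log = asyncio.run(main())
```

In the resulting log, "b waiting" appears before "a releasing": the second worker was suspended at its async with, not blocking the whole program.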

Note: unlike for, it was possible to use asynchronous objects with the plain (not async) with statement, as in with await lock:. For asyncio locks this form was deprecated in Python 3.7 and removed in Python 3.9 (note that it was never an exact equivalent of async with).

VPfB
  • Heads up that ``with await lock:`` could still be used, but it's something else than ``async with lock:``. It means the object *producing* a context manager is ``async``, not that the context manager itself is ``async``. – MisterMiyagi Sep 03 '21 at 09:04
  • basically it sounds like for `async for` that the syntax is just to make sure the for loop works properly with async code since its non-trivial to implement (it's my guess). So the for loop works just as normal but now allows the `await` key word to be used. Is this more or less right? – Charlie Parker Sep 03 '21 at 12:38
  • @CharlieParker I would say the implementation of the `async for` loop is at the same difficulty level as the plain `for`. The difference is that it loops over iterables that are working asynchronously in their internal implementation. In other words, you have to use the proper `for` to match the type of iterable. There are very few async iterables compared to regular iterables which are literally everywhere. That makes occurrences of `async for` in code quite rare. – VPfB Sep 03 '21 at 14:04
  • @VPfB thanks for the response! I think I understand why a `async with` would exist (e.g. a session client that talks to url websites where the networking is the io - so it makes sense to me that in the implementation details python simply cannot hide it). However, for the `async for` I don't understand what an asynchronous iterable really means or what it would be used for. Would the iterable be doing an async call to an io every time it gets an element? Would the `async for` loop be overlapping the io calls... – Charlie Parker Sep 03 '21 at 15:49
  • ...(eg similar to a `asyncio.gather` where it sort of feels for the user the io calls are called at the same time) I guess am confused what to expect in the `async for`. My current mental model for all this is that `await` either gives control to a new coroutine (the developer wrote) or back to the event loop (that schedules the next "free" coroutine/not waiting) and that is where the "magic" is (so io's are effectively overlapped). But with a `async for` I am not sure. I am essentially curious if it's nothing special like with the `async with` or its a manual way to implement `asyncio.gather` – Charlie Parker Sep 03 '21 at 15:51
  • @CharlieParker Let me try to explain the `async for line in tcp_stream:` line. A tcp stream is a stream of bytes transported over the network in packets of different size. The reader is collecting the packets in a buffer and only when a newline finally arrives, it has a complete text line to be returned. From the `async for` perspective, it awaits stream's `__anext__` and as you wrote, that gives the control to the event loop to run coroutines that are ready to run until the whole line is received in the buffer. Then the `__anext__` is also ready to return the line to the `async for` loop. – VPfB Sep 03 '21 at 18:14
  • 1
    @VPfB thanks for your message! Let me re-iterate just to make sure I understood. So the `async for` is essentially a generator getting things from io in an async manner so each time something is ready (crucially) **in the right order** then it returns the next thing. Is that right? So `async for` does not only allow the keyword `await` to be used inside its body but also to have the iterator get the next item in an asynchronous manner and respecting the order of the iterator. Is that right? – Charlie Parker Sep 06 '21 at 16:00
  • In summary (correct me if I am wrong): 1) iterator is async for the expensive ios for its anext method 2) the iterator respect the programmers expected order of how the elements would be returned (and not the order of how the async io's might get completed). Is this a correct understanding of `async for`? – Charlie Parker Sep 06 '21 at 16:01
  • 1
    @CharlieParker: Re 1): Yes, that is exactly the main reason for async iteration, just a tiny note: a better term than "expensive" would be "I/O Bound" (https://en.wikipedia.org/wiki/I/O_bound) Re 2): Well, it could be the case as in those examples where the iterator assembles entire lines (or other data units), but in general it is not a main characteristic of async iteration. A plain iterator reading lines from a disk file does almost the same; the difference is that local file I/O, although blocking, is usually fast enough that we can consider the result immediately available. – VPfB Sep 06 '21 at 18:27
  • I'm still sort of puzzled about the use of `async` for `for` and `with`s. My understanding is that `async def` creates a coroutine -- a function that can give up execution control to the caller. But in the `async for x in range(10)` -- I don't understand why the async is needed since I've written for loops that call awaits without issues e.g. ` for i in range(num_steps): await asyncio.sleep(1)`. So I don't understand why I need the async for the for loop. Can you clarify this? @VPfB – Charlie Parker Jun 22 '22 at 19:10
  • @CharlieParker Let me try to answer. A regular `for` loop gets the values for each iteration from a regular iterator. Every time it needs the next value for the next iteration it calls the iterator's `next` function and the iterator will read the required value from a data structure, or from a file or it will compute it like the `range` does. Now imagine that this value is not immediately available, e.g. a network request/reply is to be done. In order not to block the whole process waiting for the next value, it is obtained asynchronously, so another task can run in the meantime (part 1/2) – VPfB Jun 25 '22 at 13:19
  • @CharlieParker (part 2/2) A loop using an async iterator is the `async for`. Regular sync iterators and async iterators are two different types and not mutually compatible. That's why `async for x in range(10)` won't run at all (`TypeError`). A final remark is that `async for` is quite uncommon IMO. You may have no use-case for it in your async program. – VPfB Jun 25 '22 at 13:45
3

My understanding of async with is that it lets Python await while entering and exiting the context manager without Python freaking out. Removing the async from the with typically results in an error. This is useful because the object created will most likely perform expensive I/O operations that we have to wait for, so we will likely await methods of the object produced by this async context manager. Without it, opening and closing the context manager correctly would likely cause problems within Python (otherwise why bother Python users with even more nuanced syntax and semantics to learn?).

I have not fully tested what async for does or the intricacies of it but would love to see an example and might later test it once I need it and update this answer. I will put the example here once I get to it: https://github.com/brando90/ultimate-utils/blob/master/tutorials_for_myself/concurrency/asyncio_for.py

For now see my annotated example with async with (script lives https://github.com/brando90/ultimate-utils/blob/master/tutorials_for_myself/concurrency/asyncio_my_example.py):

"""
1. https://realpython.com/async-io-python/#the-asyncawait-syntax-and-native-coroutines
2. https://realpython.com/python-concurrency/
3. https://stackoverflow.com/questions/67092070/why-do-we-need-async-for-and-async-with

todo - async with, async for.

todo: meaning of:
    - The async for and async with statements are only needed to the extent that using plain for or with would “break”
        the nature of await in the coroutine. This distinction between asynchronicity and concurrency is a key one to grasp
    - One exception to this that you’ll see in the next code is the async with statement, which creates a context
        manager from an object you would normally await. While the semantics are a little different, the idea is the
        same: to flag this context manager as something that can get swapped out.
    - download_site() at the top is almost identical to the threading version with the exception of the async keyword on
        the function definition line and the async with keywords when you actually call session.get().
        You’ll see later why Session can be passed in here rather than using thread-local storage.
    - An asynchronous context manager is a context manager that is able to suspend execution in its enter and exit
        methods.
"""

import asyncio
from asyncio import Task

import time

import aiohttp
from aiohttp.client_reqrep import ClientResponse

from typing import Coroutine


async def download_site(coroutine_name: str, session: aiohttp.ClientSession, url: str) -> ClientResponse:
    """
    Calls an expensive io (get data from a url) using the special session (awaitable) object. Note that not all objects
    are awaitable.
    """
    # - the with statement is bad here in my opinion since async with is already mysterious and it's being used twice
    # async with session.get(url) as response:
    #     print("Read {0} from {1}".format(response.content_length, url))
    # - this won't work since it only creates the coroutine. It **has** to be awaited. The trick to have it be (buggy)
    # synchronous is to have the main coroutine call each task we want in order instead of giving all the tasks we want
    # at once to the event loop e.g. with the asyncio.gather which gives all coroutines, gets the result in a list and
    # thus doesn't block!
    # response = session.get(url)
    # - right way to do async code is to have this await so someone else can run. Note, if the download_site/ parent
    # program is awaited in a for loop this won't work regardless.
    response = await session.get(url)
    print(f"Read {response.content_length} from {url} using {coroutine_name=}")
    return response

async def download_all_sites_not_actually_async_buggy(sites: list[str]) -> list[ClientResponse]:
    """
    Code to demo the non-async code. The code isn't truly asynchronous/concurrent because we are awaiting all the io
    calls (to the network) in the for loop. To avoid this issue, give the list of coroutines to a function that actually
    dispatches the io like asyncio.gather.

    My understanding is that async with allows the object given to be a awaitable object. This means that the object
    created is an object that does io calls so it might block so it's often the case we await it. Recall that when we
    run await f() f is either 1) coroutine that gains control (but might block code!) or 2) io call that takes a long
    time. But because of how python works after the await finishes the program expects the response to "actually be
    there". Thus, doing await blindly doesn't speed up the code. Do awaits on real io calls and call them with things
    that give it to the event loop (e.g. asyncio.gather).

    """
    # - create a awaitable object without having the context manager explode if it gives up execution.
    # - crucially, the session is an aiosession - so it is actually awaitable so we can actually give it to
    # - asyncio.gather and thus in the async code we truly take advantage of the concurrency of asynchronous programming
    async with aiohttp.ClientSession() as session:
    # with aiohttp.ClientSession() as session:  # won't work because there is an await inside this with
        tasks: list[Task] = []
        responses: list[ClientResponse] = []
        for i, url in enumerate(sites):
            task: Task = asyncio.ensure_future(download_site(f'coroutine{i}', session, url))
            tasks.append(task)
            response: ClientResponse = await session.get(url)
            responses.append(response)
        return responses


async def download_all_sites_truly_async(sites: list[str]) -> list[ClientResponse]:
    """
    Truly async program that creates a bunch of coroutines that download data from urls and then uses gather to
    have the event loop run them concurrently (and thus efficiently). Note there is only one process though.
    """
    # - indicates that session is an async obj that will likely be awaited since it likely does an expensive io that
    # - waits so it wants to give control back to the event loop or other coroutines so they can do stuff while the
    # - io happens
    async with aiohttp.ClientSession() as session:
        tasks: list[Task] = []
        for i, url in enumerate(sites):
            task: Task = asyncio.ensure_future(download_site(f'coroutine{i}', session, url))
            tasks.append(task)
        responses: list[ClientResponse] = await asyncio.gather(*tasks, return_exceptions=True)
        return responses


if __name__ == "__main__":
    # - args
    sites = ["https://www.jython.org", "http://olympus.realpython.org/dice"] * 80
    start_time = time.time()

    # - run main async code
    # main_coroutine: Coroutine = download_all_sites_truly_async(sites)
    main_coroutine: Coroutine = download_all_sites_not_actually_async_buggy(sites)
    responses: list[ClientResponse] = asyncio.run(main_coroutine)

    # - print stats
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} sites in {duration} seconds")
    print('Success, done!\a')
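
The difference between the two download functions above can also be reproduced without network access. The following is only a sketch under that assumption: fake_download is a hypothetical stand-in that sleeps instead of calling aiohttp, and the log records the order of events rather than timings.

```python
import asyncio


async def fake_download(name, log):
    log.append(f"start {name}")
    await asyncio.sleep(0.01)  # stand-in for waiting on the network
    log.append(f"done {name}")
    return name


async def sequential(names):
    # Awaiting inside the loop: each download finishes before the next starts.
    log = []
    results = []
    for name in names:
        results.append(await fake_download(name, log))
    return results, log


async def concurrent(names):
    # Handing all coroutines to gather: every download starts before any finishes.
    log = []
    results = await asyncio.gather(*(fake_download(n, log) for n in names))
    return results, log


seq_results, seq_log = asyncio.run(sequential(["a", "b"]))
con_results, con_log = asyncio.run(concurrent(["a", "b"]))
```
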
Charlie Parker
  • I'm still sort of puzzled about the use of `async` for `for` and `with`s. My understanding is that `async def` creates a coroutine -- a function that can give up execution control to the caller. But in the `async for x in range(10)` -- I don't understand why the async is needed since I've written for loops that call awaits without issues e.g. ` for i in range(num_steps): await asyncio.sleep(1)`. So I don't understand why I need the async for the for loop. Can you clarify this? – Charlie Parker Jun 22 '22 at 19:10