I'm getting the flow of using asyncio in Python 3.5, but I haven't seen a description of which things I should be awaiting, which I should not, or where it would make a negligible difference. Do I just have to use my best judgement in terms of "this is an IO operation and thus should be awaited"?

dalanmiller
  • Read [PEP 492](https://www.python.org/dev/peps/pep-0492/#id50) for details, but generally speaking you should `await` on all Futures, `@coroutine` decorated functions & `async def` functions. – Jashandeep Sohi Oct 27 '15 at 00:10

2 Answers

By default all your code is synchronous. You can make it asynchronous by defining functions with async def and "calling" these functions with await. A more accurate question would be "When should I write asynchronous code instead of synchronous?" The answer is "When you can benefit from it". When you work with I/O operations, as you noted, you will usually benefit:

# Synchronous way:
download(url1)  # takes 5 sec.
download(url2)  # takes 5 sec.
# Total time: 10 sec.

# Asynchronous way:
await asyncio.gather(
    async_download(url1),  # takes 5 sec. 
    async_download(url2)   # takes 5 sec.
)
# Total time: only 5 sec. (+ little overhead for using asyncio)
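
The comparison above can be tried without a real network. The following is only a sketch under some assumptions: async_download is a hypothetical stub in which asyncio.sleep stands in for the I/O wait, the 0.2-second delay is arbitrary, and asyncio.run requires Python 3.7+.

```python
import asyncio
import time


async def async_download(url):
    # Hypothetical stub: asyncio.sleep simulates waiting on the network.
    await asyncio.sleep(0.2)
    return f"content of {url}"


async def sequential(urls):
    # Each download is awaited to completion before the next one starts.
    return [await async_download(u) for u in urls]


async def concurrent(urls):
    # gather() runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(async_download(u) for u in urls))


def timed(coro_func, urls):
    # Run one of the coroutines above and measure wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(coro_func(urls))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    urls = ["url1", "url2"]
    _, t_seq = timed(sequential, urls)
    _, t_con = timed(concurrent, urls)
    print(f"sequential: {t_seq:.2f}s, concurrent: {t_con:.2f}s")
```

The sequential version should take roughly twice as long as the concurrent one, because the waits overlap only when the coroutines are scheduled together.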

Of course, if you create a function that uses asynchronous code, that function must be asynchronous too (defined with async def). But any asynchronous function can freely use synchronous code. It makes no sense to convert synchronous code to asynchronous without a reason:

# extract_links(url) must be async because it awaits async_download() inside
async def extract_links(url):

    # async_download() is async so that it can benefit from non-blocking I/O
    html = await async_download(url)

    # parse() does no I/O, so there is no point in making it async
    links = parse(html)

    return links

One very important thing is that any long synchronous operation (longer than about 50 ms, for example; it's hard to give an exact number) will freeze all your asynchronous operations for that time:

async def extract_links(url):
    data = await download(url)
    links = parse(data)
    # If search_in_very_big_file() takes a long time to run,
    # all your running async functions (elsewhere in the code) are frozen
    # until it returns. You need to avoid this situation.
    links_found = search_in_very_big_file(links)
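
The freeze can be observed directly. In this rough sketch (all names are hypothetical, and time.sleep stands in for the long synchronous call), a heartbeat coroutine tries to tick every 10 ms while another coroutine blocks the loop for 0.3 seconds:

```python
import asyncio
import time


async def heartbeat(ticks, count):
    # Records a timestamp roughly every 10 ms, if the loop lets it run.
    for _ in range(count):
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.01)


async def blocking_work():
    await asyncio.sleep(0.02)  # let the heartbeat start ticking first
    time.sleep(0.3)            # synchronous call: blocks the whole event loop


async def main():
    ticks = []
    await asyncio.gather(heartbeat(ticks, 10), blocking_work())
    # Largest pause between two consecutive ticks.
    return max(b - a for a, b in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print(f"longest gap between ticks: {asyncio.run(main()):.2f}s")
```

The longest gap comes out at about the duration of the blocking call rather than 10 ms, which is the freeze described above.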

You can avoid it by calling long-running synchronous functions in a separate process (and awaiting the result):

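import asyncio
from concurrent.futures import ProcessPoolExecutor

executor = ProcessPoolExecutor(2)

async def extract_links(url):
    data = await download(url)
    links = parse(data)
    loop = asyncio.get_running_loop()  # Python 3.7+; on 3.5/3.6 use asyncio.get_event_loop()
    # The event loop can now run other coroutines while a separate process does the search
    links_found = await loop.run_in_executor(executor, search_in_very_big_file, links)
    return links_found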

One more example: when you need to use requests with asyncio. requests.get is just a long-running synchronous function, which you shouldn't call directly inside async code (again, to avoid freezing). But it runs long because of I/O, not because of heavy computation. In that case, you can use ThreadPoolExecutor instead of ProcessPoolExecutor to avoid some multiprocessing overhead:

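import asyncio
from concurrent.futures import ThreadPoolExecutor
import requests

executor = ThreadPoolExecutor(2)

async def download(url):
    loop = asyncio.get_running_loop()  # Python 3.7+; on 3.5/3.6 use asyncio.get_event_loop()
    response = await loop.run_in_executor(executor, requests.get, url)
    return response.text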
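
On Python 3.9 and later, asyncio.to_thread is a shorter way to push a blocking call onto a worker thread. The sketch below assumes a hypothetical blocking_fetch function, with time.sleep standing in for requests.get so that it runs without network access:

```python
import asyncio
import time


def blocking_fetch(url):
    # Hypothetical stand-in for requests.get(url).text; it blocks its thread.
    time.sleep(0.2)
    return f"content of {url}"


async def download(url):
    # asyncio.to_thread (Python 3.9+) runs the call in the default thread pool,
    # so the event loop stays free while the thread is blocked.
    return await asyncio.to_thread(blocking_fetch, url)


async def main():
    start = time.perf_counter()
    pages = await asyncio.gather(download("url1"), download("url2"))
    return pages, time.perf_counter() - start


if __name__ == "__main__":
    pages, elapsed = asyncio.run(main())
    print(pages, f"{elapsed:.2f}s")
```

Both blocking calls overlap because each runs in its own thread, so the total time should be close to that of a single call.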
Mikhail Gerasimov
  • Hi, Mikhail. What do you mean by `freezing` here? If the function `search_in_very_big_file` needs the output from `links = parse(data)`, the asynchronous download doesn't reduce the total execution time. So you call this `freeze`? Thank you. – Alston Jan 16 '19 at 14:05
  • @Stallman by `freezing` I mean what happens when any non-async function that takes a long time (> 50 ms) is executed. `requests.get(url)` and `time.sleep(1)` are examples of such functions. While such a function is running in the main thread, the event loop can't continue executing coroutines anywhere else. Therefore, in the example from the answer, awaiting the first version of `extract_links` will block the execution of coroutines in other parts of the code. To avoid this, the second version of `extract_links` runs the freezing function in a separate process. – Mikhail Gerasimov Jan 16 '19 at 14:19
  • What about this `> 50 ms`? What does it depend on? – buhtz Mar 12 '19 at 23:30
  • @buhtz it's just an arbitrary time I use as an example. It's impossible to calculate a concrete number, but the main idea is that if you block the event loop for this long or longer, it can harm the chances of other coroutines succeeding. Imagine a situation where you start an async request with a timeout of 10 seconds and immediately block the event loop for 11 seconds: the request will time out, although it could have succeeded. To prevent this, you should make sure you return control to the event loop at least once every short interval, for example every 50 ms. – Mikhail Gerasimov Mar 13 '19 at 13:45
  • Or you can use [aiohttp](https://pypi.org/project/aiohttp/), which is a request package designed to be used with asyncio. – Lord Elrond Mar 16 '19 at 21:30
  • How can you call `loop.run_in_executor()`? Where do you get the `loop` variable from? – PlsWork Jun 02 '19 at 16:00
  • @AnnaVopureta `loop = asyncio.get_event_loop()` (doc is [here](https://docs.python.org/3/library/asyncio-eventloop.html#event-loop)) – Mikhail Gerasimov Jun 02 '19 at 16:15
  • @MikhailGerasimov I can call multiple threads in the same event loop at different places, right? – y_159 Sep 12 '21 at 16:17
  • @y_159 yes, you can use `run_in_executor` (last code snippet) to run something in a thread without blocking the event loop. – Mikhail Gerasimov Sep 13 '21 at 16:00
  • In Python 3.9+, use `to_thread` to avoid freezing: `response = await asyncio.to_thread(requests.get, url)` – Attila the Fun Aug 02 '22 at 16:03

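You do not have much freedom. If you need to call a function, you need to find out whether it is a regular function or a coroutine function. As a rule, you use the await keyword when the call returns an awaitable (a coroutine, a Task or a Future) and you need its result at that point; you cannot await the result of a regular function.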
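
One way to find out what you are dealing with is the standard inspect module. This is just a small sketch; get_x and aget_x are placeholder functions defined here for illustration:

```python
import inspect


def get_x():
    return 1


async def aget_x():
    return 1


# A function defined with "async def" is a coroutine function.
is_plain = inspect.iscoroutinefunction(get_x)   # False
is_coro = inspect.iscoroutinefunction(aget_x)   # True

# Calling a coroutine function without await runs nothing yet:
# it only creates a coroutine object that still has to be awaited.
pending = aget_x()
looks_awaitable = inspect.isawaitable(pending)  # True
pending.close()  # discard it explicitly to avoid a "never awaited" warning
```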

If async functions are involved, there should be an "event loop" which orchestrates them. Strictly speaking it's not necessary: you can "manually" run an async function by sending values to it, but you probably don't want to do that. The event loop keeps track of not-yet-finished coroutines and chooses the next one to continue running. The asyncio module provides an implementation of an event loop, but it is not the only possible implementation.
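
For illustration only, this is roughly what running a coroutine "manually" looks like. The add function is a made-up example; since it never suspends, a single send(None) drives it to completion and the return value arrives inside StopIteration:

```python
async def add(a, b):
    # No await inside, so this coroutine never suspends.
    return a + b


coro = add(1, 2)  # creates the coroutine object; nothing has run yet
try:
    coro.send(None)  # start the coroutine, like an event loop would
except StopIteration as exc:
    # A finished coroutine reports its return value via StopIteration.
    result = exc.value

print(result)
```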

Consider these two lines of code:

x = get_x()
do_something_else()

and

x = await aget_x()
do_something_else()

The semantics are the same: call a function which produces some value, assign the value to the variable x when it is ready, and then do something else. In both cases the do_something_else function is called only after the previous line of code has finished. Using await does not even guarantee that control is yielded to the event loop before, during or after the execution of the asynchronous aget_x function.
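
The last point can be checked with a small experiment. In this sketch (the names other and order are made up, and asyncio.run assumes Python 3.7+), aget_x contains no suspension point, so awaiting it never gives an already scheduled task a chance to run:

```python
import asyncio

order = []


async def aget_x():
    # No await inside: this coroutine runs to completion without suspending.
    order.append("aget_x")
    return 42


async def other():
    order.append("other")


async def main():
    task = asyncio.create_task(other())  # scheduled, but not started yet
    x = await aget_x()                   # does not yield to the event loop
    order.append("after await")
    await task                           # here main() suspends and other() runs
    return x


if __name__ == "__main__":
    print(asyncio.run(main()), order)
```

After a run, order is ["aget_x", "after await", "other"]: the scheduled task only gets to run at the second await, where main() really suspends.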

Still there are some differences:

  • the second snippet can appear only inside another async function
  • aget_x is not a regular function but a coroutine function (that is, either declared with the async keyword or decorated as a coroutine)
  • aget_x is able to "communicate" with the event loop, that is, to yield some objects to it. The event loop should be able to interpret these objects as requests to perform some operation (e.g. to send a network request and wait for the response, or just to suspend this coroutine for n seconds). The regular get_x function is not able to communicate with the event loop.
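
The last difference can be made concrete with a toy driver. This is not how asyncio is implemented; it is a minimal sketch in which the request format ("sleep", seconds), sleep_request and toy_loop are all invented for the example:

```python
import time
import types


@types.coroutine
def sleep_request(seconds):
    # Yield a plain tuple to whoever is driving the coroutine (the "loop").
    reply = yield ("sleep", seconds)
    return reply


async def aget_x():
    # The yielded request travels up through await to the driver.
    reply = await sleep_request(0.01)
    return f"x after {reply}"


def toy_loop(coro):
    # Hypothetical minimal "event loop": drives a single coroutine and
    # interprets the objects it yields as requests.
    reply = None
    try:
        while True:
            kind, seconds = coro.send(reply)
            if kind == "sleep":
                time.sleep(seconds)  # a real loop would run other tasks here
                reply = "slept"
    except StopIteration as exc:
        return exc.value


result = toy_loop(aget_x())
print(result)
```

A regular function such as get_x has no way to hand a request like this to its caller and later resume where it left off.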
lesnik