When to use async vs threads in Python?

Question

Below are two examples that each run a 1-second and a 2-second sleep, one using async/await syntax and one using threads. The results look the same, but I think the two approaches work fundamentally differently. When should I use async, and when should I use threads?

import asyncio
import threading
import time
from datetime import timedelta


async def say_after(delay, what):
    await asyncio.sleep(delay)
    print(what)


async def async_tasks():
    start_time = time.perf_counter()
    task1 = asyncio.create_task(say_after(1, 'hello'))
    task2 = asyncio.create_task(say_after(2, 'world'))
    await task1
    await task2
    print(f'async time: {timedelta(seconds=time.perf_counter() - start_time)}')


def say_after2(delay, what):
    time.sleep(delay)
    print(what)


def threads():
    start_time = time.perf_counter()
    task1 = threading.Thread(target=say_after2, args=(1, 'hello'))
    task2 = threading.Thread(target=say_after2, args=(2, 'world'))
    task1.start()
    task2.start()
    task1.join()
    task2.join()
    print(f'threads time: {timedelta(seconds=time.perf_counter() - start_time)}')


if __name__ == '__main__':
    asyncio.run(async_tasks())
    threads()

hello
world
async time: 0:00:02.002056
hello
world
threads time: 0:00:02.004553

jwal · Answer 1 · 2022-09-21T17:23:23.853

It is interesting to scale this up a bit. I bumped the number of tasks from 2 to 1000, each sleeping for 0.1 seconds; the results are below. Threads allocate more memory (as expected), so you can expect them to hit a limit sooner than asyncio. Both are limited by the GIL, and neither is multiprocessing.

size, peak: (0, 0)
async time: 0:00:00.134895
size, peak: (293112, 2303252)
threads time: 0:00:00.252217
size, peak: (497151, 4764784)

In asyncio, gather is a nice way to do this.

async def async_tasks():
    start_time = time.perf_counter()
    tasks = []
    for _i in range(1000):
        tasks.append(asyncio.create_task(say_after(0.1, 'hello')))
    await asyncio.gather(*tasks)
    print(f'async time: {timedelta(seconds=time.perf_counter() - start_time)}')

There may be a nicer way with threading than what I used, but this one is simple!

def threads():
    start_time = time.perf_counter()
    tasks = []
    for _i in range(1000):
        tasks.append(threading.Thread(target=say_after2, args=(0.1, 'hello')))
    for task in tasks:
        task.start()
    for task in tasks:
        task.join()
    print(f'threads time: {timedelta(seconds=time.perf_counter() - start_time)}')
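One candidate for a "nicer way" is the standard library's concurrent.futures.ThreadPoolExecutor, which reuses a bounded set of worker threads instead of starting one thread per task. The following is only a minimal sketch: pooled_threads and its max_workers=100 cap are illustrative assumptions, and say_after2 is changed to return its message rather than print it so the results can be collected.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from datetime import timedelta


def say_after2(delay, what):
    # Blocking sleep, like the threaded worker above, but returns instead of printing.
    time.sleep(delay)
    return what


def pooled_threads(count=1000, delay=0.1, max_workers=100):
    # max_workers=100 is an arbitrary illustrative cap, not a tuned value.
    start_time = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() submits every call and yields results in submission order.
        results = list(pool.map(say_after2, [delay] * count, ['hello'] * count))
    print(f'pool time: {timedelta(seconds=time.perf_counter() - start_time)}')
    return results


if __name__ == '__main__':
    pooled_threads()
```

Because the pool is capped, the 1000 sleeps run in roughly ten batches, so this version trades wall-clock time for a bounded number of threads.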

From the docs (edited to use get_traced_memory rather than statistics, which I had wrong):

import tracemalloc


if __name__ == '__main__':
    # Start tracing Python memory allocations.
    tracemalloc.start()
    print("size, peak:", tracemalloc.get_traced_memory())
    # Reset the peak so each measurement covers only the next run.
    tracemalloc.reset_peak()
    asyncio.run(async_tasks())
    print("size, peak:", tracemalloc.get_traced_memory())
    tracemalloc.reset_peak()
    threads()
    print("size, peak:", tracemalloc.get_traced_memory())
    tracemalloc.reset_peak()
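One way to see the underlying difference is to put a blocking call inside a coroutine: it stalls the whole event loop, while the same call handed to a worker thread does not. The sketch below assumes Python 3.9+ (for asyncio.to_thread); blocking_io and the two timing functions are hypothetical names used only for illustration.

```python
import asyncio
import time


def blocking_io(delay):
    # Hypothetical stand-in for any blocking library call.
    time.sleep(delay)
    return delay


async def blocking_in_coroutines(delay=0.2):
    # time.sleep never yields to the event loop, so the two workers run back to back.
    async def worker():
        return blocking_io(delay)

    start = time.perf_counter()
    await asyncio.gather(worker(), worker())
    return time.perf_counter() - start


async def offloaded_to_threads(delay=0.2):
    # asyncio.to_thread runs each blocking call in a worker thread,
    # so the event loop stays free and the two sleeps overlap.
    start = time.perf_counter()
    await asyncio.gather(
        asyncio.to_thread(blocking_io, delay),
        asyncio.to_thread(blocking_io, delay),
    )
    return time.perf_counter() - start


if __name__ == '__main__':
    print(f'blocking in coroutines: {asyncio.run(blocking_in_coroutines()):.2f}s')
    print(f'offloaded to threads:   {asyncio.run(offloaded_to_threads()):.2f}s')
```

As a rough rule of thumb, this suggests asyncio when the work can be awaited (the library offers async APIs), and threads when you are stuck with blocking calls.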
