
I have an asynchronous request() function that mocks a call to some API. I create a batch of tasks that call it with different parameters and await them. Everything works until I add await asyncio.sleep() to request(): the sleep call seems to modify the params dictionary passed to request()! I can't avoid awaiting asyncio.sleep() if I am to implement some kind of throttling.

"""Async API wrapper"""
import asyncio
import aiohttp

async def request(params):
    print(f'params before:  {params}')
    await asyncio.sleep(1)

    print(f'params after:  {params}')
    return {'success': True, 'data': []}

async def get_resource(endpoint, start=0, filter_id=0, params={}):
    params['limit'] = 10
    params['start'] = start
    if filter_id:
        params['filter_id'] = filter_id

    r = await request(params)

    if r["success"]:
        return r
    else:
        raise ValueError(f'Request to /{endpoint} with {params} was not successful.')  

async def get_bulk_resource():
    async with aiohttp.ClientSession():
        res = []
        tasks = [asyncio.create_task(get_resource(endpoint='users', start=start)) for start in range(1, 101)]
        for task in asyncio.as_completed(tasks):
            r = await task 
            res += r["data"]

    return res

lst = asyncio.run(get_bulk_resource())
...

Printing params before and after the sleep call shows that the values are correct right before it:

params before:  {'limit': 10, 'start': 1}
params before:  {'limit': 10, 'start': 2}
params before:  {'limit': 10, 'start': 3}
...
params before:  {'limit': 10, 'start': 100}

After asyncio.sleep() returns, however, the pagination value is set to the last page in every task's execution of this coroutine:

params after:  {'limit': 10, 'start': 100}
params after:  {'limit': 10, 'start': 100}
params after:  {'limit': 10, 'start': 100}
...
params after:  {'limit': 10, 'start': 100}

1 Answer


My problem stems from the fact that get_resource() can take the request parameters either as separate arguments or as a dictionary in the params argument.

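The problem with my implementation is that the default value for params, an empty dict, is evaluated only once, when the function is defined, and that single dict is reused by every call that omits params instead of a new one being created per call. Each task writes its own start into this shared dict before reaching the first await, so by the time the tasks resume after asyncio.sleep() they all see the value written by the last task.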
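The sharing is easy to reproduce without asyncio. Here is a minimal sketch; build_params is a hypothetical helper used only for illustration and is not part of my actual code:

```python
def build_params(start, params={}):
    # The default dict is created once, when the function is defined,
    # and reused by every call that does not pass params explicitly.
    params['start'] = start
    return params

first = build_params(1)
second = build_params(2)

# Both names refer to the same dict object, so the second call
# overwrote the value stored by the first one.
same_object = first is second
first_start = first['start']
```

In the async version the same overwrite happens, but it only becomes visible once the coroutine yields control between writing to the dict and reading it back.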

Defaulting params to None and creating a new dict inside the function completely solves this problem:

async def get_resource(endpoint, start=0, filter_id=0, params=None):
    # Create a fresh dict on every call instead of sharing one default
    if params is None:
        params = {}
    params['limit'] = 10
    params['start'] = start
    if filter_id:
        params['filter_id'] = filter_id

    r = await request(params)
    ...
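To check the fix, here is a trimmed-down, self-contained sketch. It is not my full wrapper: request() here just returns snapshots of params taken before and after the await, asyncio.sleep(0) stands in for the throttling delay, and main() is a hypothetical driver. It also copies a caller-supplied dict, which is an extra precaution I am assuming is wanted so the caller's dict is not mutated either.

```python
import asyncio

async def request(params):
    before = dict(params)        # snapshot before yielding to the event loop
    await asyncio.sleep(0)       # other tasks get to run here
    return before, dict(params)  # snapshot after this task resumes

async def get_resource(start=0, params=None):
    # Copy so that neither a shared default nor the caller's dict is mutated
    params = dict(params) if params is not None else {}
    params['limit'] = 10
    params['start'] = start
    return await request(params)

async def main():
    tasks = [asyncio.create_task(get_resource(start=s)) for s in range(1, 6)]
    # gather() returns results in the same order as the tasks passed in
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
unchanged = all(before == after for before, after in results)
starts = [after['start'] for _, after in results]
```

With a separate dict per call, each task sees the same params after the await as before it.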