I have the following test:

@pytest.mark.parametrize(
    "raw_id", [28815543, "PMC5890441", "doi:10.1007/978-981-10-5203-3_9", "28815543"]
)
def test_can_fetch_publication(raw_id):
    idr = IdReference.build(raw_id)
    res = asyncio.run(fetch.summary(idr))
    
    assert res == {
        "id": "28815543",
        "source": "MED",
        "pmid": "28815543",
        "doi": "10.1007/978-981-10-5203-3_9",
        "title": "Understanding the Role of lncRNAs in Nervous System Development.",
        <... snip ...>

which runs the function fetch.summary:

@lru_cache()
@retry(requests.HTTPError, tries=5, delay=1)
@throttle(rate_limit=5, period=1.0)
async def summary(id_reference):
    LOGGER.info("Fetching remote summary for %s", id_reference)
    url = id_reference.external_url()
    response = requests.get(url)
    response.raise_for_status()

    data = response.json()
    assert data, "Somehow got no data"
    if data.get("hitCount", 0) == 0:
        raise UnknownReference(id_reference)

    if data["hitCount"] == 1:
        if not data["resultList"]["result"][0]:
            raise UnknownReference(id_reference)

        return data["resultList"]["result"][0]

    <... snip ...>

IdReference is a class defined like this:

@attr.s(frozen=True, hash=True)
class IdReference(object):
    namespace = attr.ib(validator=is_a(KnownServices))
    external_id = attr.ib(validator=is_a(str))
< ... snip ...>

My problem comes when I try to run the test. Because the first and last elements of the parameterisation produce the same IdReference object (the int gets converted to a string, external_id in the class), the coroutine returned by fetch.summary is the same, and I end up with a RuntimeError: cannot reuse already awaited coroutine

Is there a way to get around this? Or am I going to have to split the parameterisation out?

I tried running the test using pytest-asyncio and got the same error, I think because I fundamentally have to deal with the coroutine being the 'same' somehow.

Python version is 3.11, pytest 7.2.0

Theolodus

1 Answer


This is not async code. Fixing the test is trivial, but you will have to change your code, so please read the whole answer.

The part of this code responsible for giving you the same (and therefore "already used") co-routine is lru_cache.

Just reset the cache in your test body, which prevents you from getting an already-consumed co-routine, and you should be fine:

...
def test_can_fetch_publication(raw_id):
    idr = IdReference.build(raw_id)
    fetch.summary.cache_clear()
    res = asyncio.run(fetch.summary(idr))
    ...

Rather: this would work, but the test is NOT failing by accident: the failing test in fact showed you what would happen at run time if this cache were ever hit. As you can see, the functools.lru_cache() decorator DOES NOT WORK for co-routine functions. Just drop it from your code (instead of using the cache-clearing work-around above).

Since it caches whatever the function call returns, what gets cached is the co-routine object itself, not the co-routine's return value.
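To see this in isolation, here is a minimal sketch that reproduces the error without any network code. The names `fetch_value` and `demo` are made up for illustration and are not part of your project:

```python
import asyncio
from functools import lru_cache


@lru_cache()
async def fetch_value(key):
    # Hypothetical stand-in for the real remote call.
    await asyncio.sleep(0)
    return key.upper()


def demo():
    first = fetch_value("abc")
    second = fetch_value("abc")  # cache hit: lru_cache hands back the same object
    same_object = first is second

    result = asyncio.run(first)  # the first run consumes the co-routine
    try:
        asyncio.run(second)  # the second run tries to await it again
    except RuntimeError as exc:
        error = str(exc)
    else:
        error = None
    return same_object, result, error
```

Calling `demo()` shows that both calls return the very same co-routine object, that the first run succeeds, and that the second run fails with the RuntimeError from the question.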

If this cache is important, write an in-function mechanism for it instead of using a decorator. Better still, note that your function is not really async at all, because it uses the blocking requests.get: just drop the async part of that function (as it stands, it does far more harm than good). Alternatively, rewrite it using httpx, aiohttp, or another asyncio counterpart to requests.

Likewise, there is a chance the throttle decorator works for an async function, but the retry decorator certainly won't. Calling a co-routine function only returns a co-routine object; the body runs later, when that object is awaited or scheduled as a task. By then the call wrapped by retry has already returned, so any exception raised in the body occurs after the except clause in the retry decorator has been passed.
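If you do keep the function async, a rough sketch of both ideas could look like the following. Everything here is an assumption for illustration: `async_retry` is a hand-written decorator (not the one from your retry library), `fetch_remote` simulates a flaky remote call, and the cache is a plain unbounded dict keyed on the argument. The point is only that the await happens inside the try block, and that the awaited result is stored rather than the co-routine object.

```python
import asyncio
import functools


def async_retry(exc_type, tries=3, delay=0.0):
    """Retry an async function; the await sits inside the try block."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    # Exceptions raised while the body runs are caught here.
                    return await func(*args, **kwargs)
                except exc_type:
                    if attempt == tries:
                        raise
                    await asyncio.sleep(delay)
        return wrapper
    return decorator


calls = []  # records every real (uncached) fetch, for demonstration only


@async_retry(ConnectionError, tries=3)
async def fetch_remote(key):
    # Hypothetical remote call that fails once before succeeding.
    calls.append(key)
    if len(calls) < 2:
        raise ConnectionError("simulated transient failure")
    return {"id": key}


_summary_cache = {}  # in-function cache: stores awaited results


async def summary(key):
    if key not in _summary_cache:
        _summary_cache[key] = await fetch_remote(key)
    return _summary_cache[key]
```

With this layout, running `asyncio.run(summary("28815543"))` twice only reaches `fetch_remote` during the first run; the second run is served from the dict. Note that two concurrent tasks asking for the same key could both miss the cache, so a lock or a shared future would be needed if that matters to you.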

jsbueno
  • Thank you! I fixed the code, but not quite as you suggest, so that it will work outside of the test. That cache is kinda important, so I combined ideas from this answer https://stackoverflow.com/a/46723144/3249000 with a threading RLock to add another decorator that allows lru_cache to do its thing. The asynchronicity in my code comes from the throttle decorator, which converts the function into a coroutine. Good point about `retry`; I will have a look at it. I may have to restructure things a bit to make that work as expected. – Theolodus Nov 16 '22 at 15:36