I'm trying to use a semaphore to throttle asynchronous requests to my target host, but I get the following error, which I assume means that my asyncio.sleep() is not actually sleeping. How can I fix this? I want to add a delay to my requests for each URL targeted.

Error:

RuntimeWarning: coroutine 'sleep' was never awaited
Coroutine created at (most recent call last)
  File "sephora_scraper.py", line 71, in <module>
    loop.run_until_complete(main())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 571, in run_until_complete
    self.run_forever()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 539, in run_forever
    self._run_once()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 1767, in _run_once
    handle._run()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "makeup.py", line 26, in get_html
    asyncio.sleep(delay)
  asyncio.sleep(delay)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

Code:

import sys
import time
import asyncio
import aiohttp

async def get_html(semaphore, session, url, delay=6):
    await semaphore.acquire()
    async with session.get(url) as res:
        html = await res.text()
        asyncio.sleep(delay)
        semaphore.release()
        return html

async def main():
    categories = {
        "makeup": "https://www.sephora.com/shop/"
    }
    semaphore = asyncio.Semaphore(value=1)
    tasks = []
    async with aiohttp.ClientSession(loop=loop, connector=aiohttp.TCPConnector(ssl=False)) as session:
        for category, url in categories.items():
            # Get HTML of all pages
            tasks.append(get_html(semaphore, session, url))
        res = await asyncio.gather(*tasks)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
Liondancer
  • Also use `async with` or at least `try/except` with semaphore as shown [here](https://docs.python.org/3/library/asyncio-sync.html#asyncio.Semaphore). It'll guarantee semaphore is released even on exception. – Mikhail Gerasimov Jan 08 '19 at 10:44
  • Thanks for asking :) ... I'm not sure why the semaphore? Main can be shortened `asyncio.run(main())` and ClientSession can do without `loop` as it is optional. – Clemens Tolboom May 14 '22 at 14:11
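
A minimal sketch of what the two comments above suggest, assuming Python 3.7+: the semaphore is held with `async with` so it is released even if the request raises, and the script is started with `asyncio.run()`. To keep it runnable without a network, `fake_fetch` is a hypothetical stand-in for `session.get(url)`; the example URLs and the short delay are also made up for illustration.

```python
import asyncio
import time


async def fake_fetch(url):
    # Hypothetical stand-in for `session.get(url)`; performs no network I/O.
    await asyncio.sleep(0)
    return f"<html>{url}</html>"


async def get_html(semaphore, url, delay):
    # `async with` acquires the semaphore and always releases it,
    # even if fake_fetch() raises.
    async with semaphore:
        html = await fake_fetch(url)
        # Pause while still holding the semaphore, so the next
        # request cannot start until the delay has passed.
        await asyncio.sleep(delay)
        return html


async def main(urls, delay):
    semaphore = asyncio.Semaphore(value=1)  # one request at a time
    tasks = [get_html(semaphore, url, delay) for url in urls]
    return await asyncio.gather(*tasks)


def run_demo(delay=0.05):
    # Made-up URLs, used only to label the fake responses.
    urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
    start = time.perf_counter()
    pages = asyncio.run(main(urls, delay))
    elapsed = time.perf_counter() - start
    return pages, elapsed


if __name__ == "__main__":
    pages, elapsed = run_demo()
    print(len(pages), "pages fetched in", round(elapsed, 2), "seconds")
```

With a semaphore value of 1 the three calls run one after another, so the total time is roughly three times the delay.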

1 Answer

asyncio.sleep(delay)

Change it to:

await asyncio.sleep(delay)

asyncio.sleep is a coroutine function: calling it only creates a coroutine object, which does nothing until it is awaited.
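
To illustrate the difference, here is a small sketch (assuming Python 3.7+; the helper names `without_await` and `with_await` are made up for this example). The first helper calls `asyncio.sleep` without `await` and only gets a coroutine object back; the second awaits it and actually pauses.

```python
import asyncio
import inspect
import time


async def without_await(delay):
    # Calling asyncio.sleep() without `await` only builds a coroutine object.
    coro = asyncio.sleep(delay)
    is_coroutine = inspect.iscoroutine(coro)
    # Close it explicitly so Python does not emit the
    # "coroutine 'sleep' was never awaited" warning in this demo.
    coro.close()
    return is_coroutine


async def with_await(delay):
    start = time.perf_counter()
    # `await` hands control to the event loop until the delay has elapsed.
    await asyncio.sleep(delay)
    return time.perf_counter() - start


async def demo(delay=0.05):
    is_coroutine = await without_await(delay)
    slept_for = await with_await(delay)
    return is_coroutine, slept_for


if __name__ == "__main__":
    is_coroutine, slept_for = asyncio.run(demo())
    print("un-awaited call returned a coroutine object:", is_coroutine)
    print("awaited call paused for about", round(slept_for, 2), "seconds")
```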

Mikhail Gerasimov