As a learning exercise, I'm trying to modify the quickstart example of aiohttp to fetch multiple urls with a single ClientSession (the docs suggest that usually one ClientSession should be created per application).

import aiohttp
import asyncio

async def fetch(session, url):
  async with session.get(url) as response:
    return await response.text()

async def main(url, session):
  print(f"Starting '{url}'")
  html = await fetch(session, url)
  print(f"'{url}' done")

urls = (
  "https://python.org",
  "https://twitter.com",
  "https://tumblr.com",
  "https://example.com",
  "https://github.com",
)

loop = asyncio.get_event_loop()
session = aiohttp.ClientSession()
loop.run_until_complete(asyncio.gather(
  *(loop.create_task(main(url, session)) for url in urls)
))
# session.close()   <- this doesn't make a difference

However, creating the ClientSession outside a coroutine is clearly not the way to go:


➜ python 1_async.py
1_async.py:30: UserWarning: Creating a client session outside of coroutine is a very dangerous idea
  session = aiohttp.ClientSession()
Creating a client session outside of coroutine
client_session: 
Starting 'https://python.org'
Starting 'https://twitter.com'
Starting 'https://tumblr.com'
Starting 'https://example.com'
Starting 'https://github.com'
'https://twitter.com' done
'https://example.com' done
'https://github.com' done
'https://python.org' done
'https://tumblr.com' done
1_async.py:34: RuntimeWarning: coroutine 'ClientSession.close' was never awaited
  session.close()
Unclosed client session
client_session: 
Unclosed connector
connections: ['[(, 15024.110107067)]', '[(, 15024.147785039)]', '[(, 15024.252375415)]', '[(, 15024.292646968)]', '[(, 15024.342368087)]', '[(, 15024.466971983)]', '[(, 15024.602057745)]', '[(, 15024.837045568)]']
connector: 

FWIW, this was main before I attempted the above change:

async def main(url):
  async with aiohttp.ClientSession() as session:
    print(f"Starting '{url}'")
    html = await fetch(session, url)
    print(f"'{url}' done")

What would be the correct way to do this? I thought about passing a list of urls to main but couldn't make it work in a non-sequential fashion.

Tamás Szelei

1 Answer


Creating a client session outside of a coroutine is a very dangerous idea because the session is bound to the current event loop at the moment you create it. If you later run it on a different loop, it will hang. If you are careful to keep using the same loop, you can ignore the warning. Related doc.

Personally, I just ignore this warning. But it is also easy to avoid:

async def create_session():
    return aiohttp.ClientSession()

session = asyncio.get_event_loop().run_until_complete(create_session())

Further, you don't need to create Task objects explicitly. You can pass the coroutines straight to gather, which schedules them for you:

loop.run_until_complete(asyncio.gather(
  *(main(url, session) for url in urls)
))
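
To see that gather takes care of the scheduling, here is a small sketch that needs no network. The work coroutine and its delays are made up for illustration and stand in for main(url, session). As far as I understand it, gather wraps each bare coroutine in a task, runs them concurrently, and returns the results in the order the arguments were passed, regardless of which one finished first.

```python
import asyncio


async def work(name, delay, finished):
    # Hypothetical stand-in for main(url, session): sleep instead of fetching.
    await asyncio.sleep(delay)
    finished.append(name)  # record the order in which the coroutines complete
    return name.upper()


async def demo():
    finished = []
    # Bare coroutines are passed in; no loop.create_task() calls are needed.
    results = await asyncio.gather(
        work("a", 0.03, finished),
        work("b", 0.01, finished),
        work("c", 0.02, finished),
    )
    return results, finished


if __name__ == "__main__":
    results, finished = asyncio.run(demo())
    print(results)   # results follow the argument order
    print(finished)  # completion order follows the delays
```

The results list follows the argument order, while finished shows that the shortest sleep completed first, so the coroutines really did overlap.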

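Finally, don't forget that close is a coroutine: calling session.close() without awaiting it does nothing, which is why you get the "never awaited" warning. Use loop.run_until_complete(session.close()) to close the session.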

BTW, if you want to create an async-like loop, you can refer to another answer of mine.

Sraw
  • *Further, now `Task` object has been deprecated, you should use `Future` object.* What does this mean? `Task` and `Future` do different things, and I have seen no sign of `Task` being deprecated in any way. – user4815162342 Aug 18 '18 at 19:48
  • Well, I don't know why I remember this but after re-reading the doc, I think I'm wrong. Will edit my answer. Thank you. – Sraw Aug 18 '18 at 23:34
  • Thanks, I didn't know about `gather`! Still much to learn in asyncio-world. – Tamás Szelei Aug 19 '18 at 11:35