
I want to connect to a long list of different sites as fast as possible. I'm using asyncio to do this asynchronously, and now want to add a timeout so that connections that take too long to respond are skipped.

How do I implement this?

import ssl
import asyncio
from contextlib import suppress
from concurrent.futures import ThreadPoolExecutor
import time


@asyncio.coroutine
def run():
    while True:
        host = yield from q.get()
        if not host:
            break

        with suppress(ssl.CertificateError):
            reader, writer = yield from asyncio.open_connection(host[1], 443, ssl=True)  # timeout option?
            reader.close()
            writer.close()


@asyncio.coroutine
def load_q():
    # only 3 entries for debugging reasons
    for host in [[1, 'python.org'], [2, 'qq.com'], [3, 'google.com']]:
        yield from q.put(host)
    for _ in range(NUM):
        q.put(None)


if __name__ == "__main__":
    NUM = 1000
    q = asyncio.Queue()

    loop = asyncio.get_event_loop()
    loop.set_default_executor(ThreadPoolExecutor(NUM))

    start = time.time()
    coros = [asyncio.async(run()) for i in range(NUM)]
    loop.run_until_complete(load_q())
    loop.run_until_complete(asyncio.wait(coros))
    end = time.time()
    print(end-start)

(On a side note: does anybody have an idea how to optimize this?)

dano
scpio
  • You forgot to `yield from` the calls to `q.put(None)` inside `load_q`, so this code won't work as currently written. – dano Apr 20 '15 at 19:38
  • You don't need `reader`, `writer` here. You could use `loop.create_connection` with a `Protocol` that does nothing (it closes the network connection as soon as it is established). Here's a [code example that I've tried on the top million Alexa site list](http://stackoverflow.com/a/20722204/4279) (it might be slightly outdated, e.g., it doesn't use some convenience functions such as `asyncio.wait_for()`). It uses a single thread and opens up to `limit` SSL connections. – jfs Apr 20 '15 at 22:55
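
A rough sketch of the approach described in the comment above, written with `async`/`await`. This is not the linked code: the names `CloseOnConnect`, `probe`, and `probe_all` and their parameters are made up for illustration, and catching `OSError` for refused or failed connections is an assumption about what should count as "skip this host".

```python
import asyncio


class CloseOnConnect(asyncio.Protocol):
    """Protocol that does nothing except close the connection once it is up."""

    def connection_made(self, transport):
        transport.close()


async def probe(host, port=443, ssl=True, timeout=3.0):
    """Return True if a connection to host:port is established within `timeout`."""
    loop = asyncio.get_running_loop()
    try:
        await asyncio.wait_for(
            loop.create_connection(CloseOnConnect, host, port, ssl=ssl),
            timeout=timeout,
        )
    except (asyncio.TimeoutError, OSError):
        # Timed out, refused, DNS failure, TLS/certificate error, and so on.
        return False
    return True


async def probe_all(hosts, limit=100, **kwargs):
    """Probe every host, with at most `limit` connection attempts in flight."""
    sem = asyncio.Semaphore(limit)

    async def bounded(host):
        async with sem:
            return host, await probe(host, **kwargs)

    return dict(await asyncio.gather(*(bounded(h) for h in hosts)))
```

Something like `asyncio.run(probe_all(['python.org', 'qq.com', 'google.com']))` would then return a dict mapping each host to `True` or `False`. The semaphore plays the role of `limit` in the comment, so no thread pool or queue is needed.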

1 Answer


You can wrap the call to `open_connection` in `asyncio.wait_for`, which allows you to specify a timeout:

    with suppress(ssl.CertificateError):
        fut = asyncio.open_connection(host[1], 443, ssl=True)
        try:
            # Wait for 3 seconds, then raise TimeoutError
            reader, writer = yield from asyncio.wait_for(fut, timeout=3)
        except asyncio.TimeoutError:
            print("Timeout, skipping {}".format(host[1]))
            continue
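
With `async def`/`await` syntax, the same idea might look like the sketch below. The helper name `try_connect` and its `opener` parameter are illustrative assumptions (`opener` exists only so a stand-in for `asyncio.open_connection` can be swapped in); the 3-second timeout and the handling of `ssl.CertificateError` follow the snippet above.

```python
import asyncio
import ssl


async def try_connect(host, port=443, timeout=3.0, opener=asyncio.open_connection):
    """Return True if a TLS connection to `host` opens within `timeout` seconds."""
    try:
        # wait_for cancels the connection attempt if it exceeds the timeout.
        reader, writer = await asyncio.wait_for(
            opener(host, port, ssl=True), timeout=timeout
        )
    except asyncio.TimeoutError:
        print("Timeout, skipping {}".format(host))
        return False
    except ssl.CertificateError:
        # Same behaviour as suppress(ssl.CertificateError) in the question.
        return False
    writer.close()  # only the writer has close(); the reader does not
    return True
```

Inside the worker loop this would be called as `ok = await try_connect(host[1])`.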

Note that when `TimeoutError` is raised, the `open_connection` coroutine is also cancelled. If you don't want it to be cancelled (though I think you do want it to be cancelled in this case), you have to wrap the call in `asyncio.shield`.
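
To illustrate the `asyncio.shield` variant, here is a minimal sketch using `async`/`await`. The helper name `open_with_shield` is hypothetical, and returning the task alongside the result is just one way to let the caller pick up the connection later.

```python
import asyncio


async def open_with_shield(connect_coro, timeout=3.0):
    """Stop waiting after `timeout`, but let the connection attempt keep running.

    Returns (result, task); result is None if the wait timed out.
    """
    task = asyncio.ensure_future(connect_coro)
    try:
        # shield() absorbs the cancellation that wait_for issues on timeout,
        # so only the wait is abandoned, not the underlying task.
        result = await asyncio.wait_for(asyncio.shield(task), timeout=timeout)
    except asyncio.TimeoutError:
        return None, task
    return result, task
```

After a timeout the caller still owns `task` and can `await` it later or cancel it explicitly. For the "skip slow hosts" use case in the question, plain `wait_for` without `shield` is probably what you want, since abandoned attempts would otherwise keep sockets open.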

dano
  • But won't this also make it a blocking call, like opening connections one after another in a normal loop? – Ali Faizan Nov 22 '17 at 10:48
  • @ali No, because all the calls to the `run` method are wrapped in an `asyncio.async` call, which means they all run concurrently. – dano Nov 25 '17 at 02:41
  • If the connection timeout needs to be inside another coroutine, see [Python asyncio force timeout](https://stackoverflow.com/questions/28609534/python-asyncio-force-timeout/48546189#48546189) about stacking `asyncio.ensure_future(asyncio.wait_for(create_connection()))` – Jari Turkia Jan 31 '18 at 15:55
  • I'm pretty sure this stopped working with 3.7 because of this change mentioned in the wait_for doc- `Changed in version 3.7: When aw is cancelled due to a timeout, wait_for waits for aw to be cancelled. Previously, it raised asyncio.TimeoutError immediately.` – WirthLuce Jun 23 '19 at 16:17
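
The documented 3.7 behaviour quoted in the last comment can be observed with a small timing experiment. This is only a sketch: `slow_to_cancel` and `measure` are hypothetical names, and the 0.2-second "cleanup" delay is an arbitrary stand-in for a connection attempt that does not unwind immediately when cancelled. It shows that `wait_for` waits for the cancellation to finish; it does not by itself show that the answer's approach stops working.

```python
import asyncio
import time


async def slow_to_cancel():
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        # Simulate cleanup that takes a while before cancellation completes.
        await asyncio.sleep(0.2)
        raise


async def measure(timeout=0.05):
    """Return how long wait_for took to raise TimeoutError."""
    start = time.monotonic()
    try:
        await asyncio.wait_for(slow_to_cancel(), timeout=timeout)
    except asyncio.TimeoutError:
        pass
    return time.monotonic() - start
```

On Python 3.7 or later, `asyncio.run(measure())` should come out at roughly the timeout plus the cleanup time rather than the timeout alone, so the effective wait can exceed the `timeout` argument when the wrapped awaitable is slow to cancel.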