2

I need to fetch and parse the content of a single link repeatedly. The synchronous approach gives me 2-3 responses per second, and I need it to be faster (yes, I know that too fast is bad too).

I found some async examples, but all of them show how to handle the results after all links have been fetched, whereas I need to parse each response immediately after receiving it. Something like this, but this code doesn't give any speed improvement:

import aiohttp
import asyncio
import time
async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    while True:
        async with aiohttp.ClientSession() as session:
            html = await fetch(session, 'https://example.com')
            print(time.time())
            #do_something_with_html(html)
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
Messa
vreter

2 Answers

1

but this code doesn't give any speed improvement

asyncio (and async/concurrency in general) gives a speed improvement when I/O operations can overlap each other.

When everything you do is await something, and you never create any concurrent tasks (using asyncio.create_task(), asyncio.ensure_future(), etc.), you are basically doing classic synchronous programming :)
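To see the difference without touching the network, here is a minimal sketch that replaces the HTTP request with `asyncio.sleep()`. The `fake_fetch` helper, the function names, and the 0.1-second delay are assumptions made for illustration; they are not part of the question's code.

```python
import asyncio
import time


async def fake_fetch(delay):
    # Hypothetical stand-in for an HTTP request: it only waits `delay` seconds.
    await asyncio.sleep(delay)
    return delay


async def sequential(n, delay):
    # Each await finishes before the next one starts, so the waits add up.
    start = time.perf_counter()
    for _ in range(n):
        await fake_fetch(delay)
    return time.perf_counter() - start


async def concurrent(n, delay):
    # All tasks are created first, so their waits overlap.
    start = time.perf_counter()
    tasks = [asyncio.create_task(fake_fetch(delay)) for _ in range(n)]
    await asyncio.gather(*tasks)
    return time.perf_counter() - start


async def compare(n=5, delay=0.1):
    # Returns (sequential_seconds, concurrent_seconds).
    return await sequential(n, delay), await concurrent(n, delay)


if __name__ == "__main__":
    seq, conc = asyncio.run(compare())
    print(f"sequential: {seq:.2f}s, concurrent: {conc:.2f}s")
```

The sequential version takes roughly `n * delay` seconds, while the concurrent one takes roughly a single `delay`, which is the effect the question's loop is missing.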

So, how to make the requests faster:

import aiohttp
import asyncio
import time

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def check_link(session):
    html = await fetch(session, 'https://example.com')
    print(time.time())
    #do_something_with_html(html)

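async def main():
    tasks = set()  # keep references so pending tasks are not garbage-collected
    async with aiohttp.ClientSession() as session:
        while True:
            # Start a new request without waiting for the previous one to finish.
            task = asyncio.create_task(check_link(session))
            tasks.add(task)
            task.add_done_callback(tasks.discard)
            await asyncio.sleep(0.05)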

asyncio.run(main())

Note: the async with aiohttp.ClientSession() as session: line must be above (outside) while True: for this to work, because the tasks keep using the session after the loop iteration that started them has moved on. Having a single ClientSession() for all your requests is good practice anyway.
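The loop above starts a new request every 0.05 seconds no matter how many are still in flight. If you would rather cap the number of simultaneous requests, one possible variation is a fixed pool of worker coroutines, each handling its response as soon as it arrives. This is only a sketch: `fake_fetch`, `worker`, `run_workers`, and the `rounds` limit are hypothetical names, and the fetch is simulated with `asyncio.sleep()` so the example runs without a network.

```python
import asyncio


async def fake_fetch(url):
    # Hypothetical stand-in for the real fetch(session, url).
    await asyncio.sleep(0.05)
    return f"<html>{url}</html>"


async def worker(name, url, rounds, results):
    # Each worker fetches repeatedly and handles every response immediately.
    for _ in range(rounds):
        html = await fake_fetch(url)
        # Stands in for do_something_with_html(html).
        results.append((name, len(html)))


async def run_workers(url, n_workers=5, rounds=3):
    results = []
    # gather() runs the workers concurrently, so at most
    # n_workers requests are in flight at any moment.
    await asyncio.gather(
        *(worker(i, url, rounds, results) for i in range(n_workers))
    )
    return results


if __name__ == "__main__":
    handled = asyncio.run(run_workers("https://example.com"))
    print(len(handled), "responses handled")
```

With aiohttp, the same shape should work by passing the shared session into each worker and calling `fetch(session, url)` in place of `fake_fetch(url)`.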

Messa
0

I gave up on async; threading solved my problem, thanks to this answer: https://stackoverflow.com/a/23102874/5678457

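from threading import Thread
import requests
import time


class myClassA(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.daemon = True  # daemon threads stop when the main thread exits
        self.start()

    def run(self):
        # Each thread keeps requesting the page and handles the response right away.
        while True:
            r = requests.get('https://ex.com')
            print(r.status_code, time.time())


for i in range(5):
    myClassA()

# Keep the main thread alive; otherwise the daemon threads are killed at once.
while True:
    time.sleep(1)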
vreter