
I want to send GET requests to a server at fixed time intervals and log the request and response times. The interval can be on the order of tens of milliseconds. My first approach was to use a thread pool, as described in this answer https://stackoverflow.com/a/2635066/2390362: I would put a task into the queue whenever the interval elapses and it's time to make a request. While this worked, it didn't seem to scale well.
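Roughly, the thread-pool version looked like this (a minimal sketch: `fetch` is a stub standing in for the real GET request, and the pool size, URL, and timings are placeholders):

```python
import time
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime

# Stub standing in for the real GET request; a real version would
# use e.g. urllib.request or requests here.
def fetch(url):
    time.sleep(0.02)          # simulated network latency
    return 200                # simulated status code

def timed_request(url, log):
    req_time = datetime.now()
    code = fetch(url)
    resp_time = datetime.now()
    log.append((req_time, resp_time, (resp_time - req_time).total_seconds(), code))

log = []
interval = 0.01
with ThreadPoolExecutor(max_workers=20) as pool:
    for _ in range(5):
        # Submit a task each time the interval elapses; each worker
        # thread blocks on its own request without delaying this loop.
        pool.submit(timed_request, "http://www.someurl.com/", log)
        time.sleep(interval)
# leaving the with-block waits for all in-flight requests
```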

I came across Tornado in this other answer https://stackoverflow.com/a/25549675/2390362. This seems to perform much better with heavier loads. Here's roughly how I adapted it to do what I described above.

import time
from tornado import ioloop, httpclient
from datetime import datetime, timedelta
from functools import partial

i = 0

def handle_request(req_time, log, response):
    resp_time = datetime.now()
    log.write("%s,\t%s,\t%s,\t%s\n"%(req_time.time(), resp_time.time(), (resp_time - req_time).total_seconds(), response.code))
    global i
    i -= 1
    if i == 0:
        ioloop.IOLoop.instance().stop()


def do_intervals():
    http_client = httpclient.AsyncHTTPClient()
    req_count_limit = 3000
    interval = 0.01
    url = "http://www.someurl.com/"
    global i

    with open("log_file.log", 'a') as log:

        for job_counter in range(req_count_limit):

            i += 1

            req_time = datetime.now()
            current_callback = partial(handle_request, req_time, log)
            http_client.fetch(url.strip(), current_callback, method='GET')
            time.sleep(interval)
        ioloop.IOLoop.instance().start()


if __name__ == '__main__':
    do_intervals()

However, I've noticed that the callbacks only execute after all the requests have been sent, not when each response arrives. This makes my measurement of the response time inaccurate. I just discovered Tornado and am not entirely sure how the code above really works. Is there a way to get the response time that I'm missing, or is this the only way Tornado and asynchronous HTTP work?

Jad S
1 Answer


In handle_request, the request duration can be reasonably measured with resp_time - req_time.

The problem is that you're blocking the event loop with time.sleep, which means most of the processing doesn't make progress until the initial for loop completes. See Why isn’t this example with time.sleep() running in parallel? Try something like:

from tornado import gen

@gen.coroutine
def do_intervals():
    # ... existing code ...
    yield gen.sleep(interval)  # instead of time.sleep

Remove IOLoop.instance().start() from do_intervals. Run it like:

IOLoop.instance().run_sync(do_intervals)
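The principle can be shown end to end with a stdlib asyncio sketch (modern Tornado's IOLoop runs on top of asyncio, so the pattern is the same): requests are scheduled at fixed intervals with a non-blocking sleep, and each handler runs as soon as its own response arrives. Here `fake_fetch` is a stub for the real async HTTP fetch, and the URL and timings are placeholders:

```python
import asyncio
from datetime import datetime

# Stub standing in for an async HTTP fetch (e.g. Tornado's
# AsyncHTTPClient.fetch); URL and timings are placeholders.
async def fake_fetch(url):
    await asyncio.sleep(0.03)   # simulated network latency
    return 200                  # simulated status code

async def timed_request(url, log):
    req_time = datetime.now()
    code = await fake_fetch(url)
    resp_time = datetime.now()
    # This runs as soon as *this* response arrives,
    # not after the whole send loop finishes.
    log.append(((resp_time - req_time).total_seconds(), code))

async def do_intervals():
    log = []
    tasks = []
    for _ in range(5):
        # Schedule the request and keep looping; the event loop
        # interleaves sends and response handlers.
        tasks.append(asyncio.create_task(
            timed_request("http://www.someurl.com/", log)))
        await asyncio.sleep(0.01)   # non-blocking, unlike time.sleep
    await asyncio.gather(*tasks)
    return log

log = asyncio.run(do_intervals())
```

Because the sleep yields control to the event loop, each measured duration reflects the individual request's round trip rather than the time until the whole loop finished.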
A. Jesse Jiryu Davis
  • Since `yield gen.sleep(interval)` breaks out of the loop, I put the part inside the loop in a separate `@gen.coroutine` function and used `gen.sleep()`. However it still seems to be calling the handler after all the requests are sent. – Jad S Jun 22 '16 at 14:46
  • Nope, you really need to `yield gen.sleep(interval)` during the loop iteration if you want to sleep in a non-blocking manner while looping. If you decorate your `do_intervals` with `gen.coroutine` and follow the rest of my instructions it'll work as you want. I've updated my answer to be clearer. – A. Jesse Jiryu Davis Jun 22 '16 at 19:02
  • I tried using `ioloop.PeriodicCallback` which worked but your solution seems to scale better (mine would skip some callbacks). PS: I also had to remove the `ioloop.IOLoop.instance().stop()` or else I'd get a `TimeoutError` – Jad S Jun 22 '16 at 23:52
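For reference, the `PeriodicCallback` pattern mentioned in the comment boils down to the event loop re-arming a timer after each tick, which is why a slow tick can cause later ones to be skipped. A minimal stdlib sketch of that re-arming mechanism (the interval and stop condition here are arbitrary placeholders; Tornado's `ioloop.PeriodicCallback(callback, callback_time_ms)` does the equivalent internally):

```python
import asyncio

def start_periodic(loop, interval, callback, stop_after):
    state = {"count": 0}
    def tick():
        state["count"] += 1
        callback(state["count"])
        if state["count"] < stop_after:
            # Re-arm the timer for the next tick; if tick() itself
            # ran long, the next fire is pushed back accordingly.
            loop.call_later(interval, tick)
        else:
            loop.stop()
    loop.call_later(interval, tick)

fired = []
loop = asyncio.new_event_loop()
start_periodic(loop, 0.01, fired.append, stop_after=3)
loop.run_forever()
loop.close()
```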