
I have an application that makes 100 HTTP requests every second using a single IOLoop. The requests time out after 10 seconds. I am not doing any processing with the response of the request at the moment.

What I have noticed is that the memory footprint of the program gradually grows to 1GB of RAM until the OS kills it, leading me to think that Python or Tornado is not managing memory optimally.

What I would like to try next is to run multiple IOLoops (ten running concurrently), in the hope that when I stop and close an IOLoop it will free up some memory and my application can keep running.

A few questions:

  1. Would this approach help free up program memory?
  2. Why is the memory footprint gradually growing?
  3. How do I start up multiple IOLoops and then shut them back down?

Any help would be appreciated - I have tried using processes and threads to manage the memory, but nothing has worked well so far.

If it helps here is my current code:

```python
import datetime
import tornado.httpclient
import tornado.ioloop
from tornado.httpclient import AsyncHTTPClient

PROXIES = []

def load_proxies():
    """Read proxies from file and store them in PROXIES"""
    ...

def test_proxies():
    """Schedule a test fetch through every proxy, once per second"""
    global PROXIES
    print '\nProxy Count: ' + str(len(PROXIES)) + '\n'
    for proxy in PROXIES:
        request = tornado.httpclient.HTTPRequest("http://target.com", request_timeout=5)
        request.proxy_host = proxy['host']
        request.proxy_port = proxy['port']
        HTTP_CLIENT.fetch(request, handle_response)
    tornado.ioloop.IOLoop.current().add_timeout(datetime.timedelta(seconds=1), test_proxies)

def handle_response(response):
    """Drop a proxy from the pool if its test fetch did not return 200"""
    try:
        proxy = {d['host']: d for d in PROXIES}[response.request.proxy_host]
    except KeyError:
        return

    if response.code != 200:
        PROXIES.remove(proxy)
    print response.code

AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient", max_clients=10000)
HTTP_CLIENT = AsyncHTTPClient()
load_proxies()
test_proxies()
tornado.ioloop.IOLoop.current().start()
```
etayluz

1 Answer


I think the problem is with the requests your client keeps in flight. Do you handle the requests correctly when the timeout occurs? Have a look here: Right way to “timeout” a Request in Tornado. It would also help if you posted at least a portion of your code.
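For reference, Tornado's HTTP client reports a fetch that hit its `request_timeout` with error code 599 rather than a normal HTTP status, so `handle_response` can detect timed-out fetches explicitly. A minimal sketch of the check (`FakeResponse` is a hypothetical stand-in for `tornado.httpclient.HTTPResponse`, used here only so the snippet is self-contained):

```python
import collections

# Hypothetical stand-in for tornado.httpclient.HTTPResponse; real
# responses come from AsyncHTTPClient.fetch.
FakeResponse = collections.namedtuple("FakeResponse", ["code", "request_time"])

def is_timeout(response):
    # Tornado signals a client-side timeout (or connection error)
    # with the non-standard code 599.
    return response.code == 599

print(is_timeout(FakeResponse(599, 10.0)))  # a fetch that timed out
print(is_timeout(FakeResponse(200, 0.3)))   # a normal response
```

In the question's `handle_response`, a 599 would fall into the `response.code != 200` branch and remove the proxy, but checking for it explicitly makes the timeout path visible.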

The solution with multiple IOLoops may let you issue more requests at the same time thanks to multithreading. However, I don't think it will stop the high memory usage.

There is a good description of how to use multiple IOLoops here: Tornado multiple IOLoop in multithreads
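On Tornado 5 and later the IOLoop is a wrapper around the standard asyncio event loop, so the one-loop-per-thread pattern from that link can be sketched with asyncio directly (the thread names and `task` coroutine are illustrative; on older Tornado you would construct and later `close()` a `tornado.ioloop.IOLoop()` instead):

```python
import asyncio
import threading

results = []

async def task(name):
    # Stand-in for real work (e.g. a batch of fetches) on this loop.
    results.append(name)

def worker(name):
    # One event loop per thread: create it, run it, then close it so
    # the loop's internal resources are released.
    loop = asyncio.new_event_loop()
    try:
        loop.run_until_complete(task(name))
    finally:
        loop.close()

threads = [threading.Thread(target=worker, args=("loop-%d" % i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # prints ['loop-0', 'loop-1', 'loop-2']
```

Note that closing a loop frees the loop's own bookkeeping, but it won't reclaim memory held by responses your code still references.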

Mateusz Kleinert