I have an application that makes 100 HTTP requests every second using a single IOLoop. The requests time out after 10 seconds, and at the moment I am not doing any processing with the responses.
What I have noticed is that the program's memory footprint gradually grows to about 1 GB of RAM, at which point the OS kills it, which leads me to think that Python or Tornado is not managing memory well here.
What I would like to try next is to have multiple IOLoops (ten running concurrently), in the hope that stopping and closing an IOLoop frees up some memory so the application can keep running.
A few questions:
- Would this approach help free up program memory?
- Why is the memory footprint gradually growing?
- How do I start up multiple IOLoops and then shut them back down?
Any help would be appreciated. I have tried using processes and threads to manage the memory, but nothing has worked well so far.
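For what it's worth, this is roughly what I was imagining for the third question: one IOLoop per thread, stopped and closed when it is no longer needed. It is only a sketch (assuming Tornado 4.x), and I have not verified that closing a loop this way actually releases memory:

import threading

import tornado.ioloop


def run_loop(loop):
    """Run an IOLoop in the current thread until loop.stop() is called."""
    loop.make_current()
    loop.start()


# Create ten independent IOLoops, each driven by its own thread
loops = [tornado.ioloop.IOLoop() for _ in range(10)]
threads = [threading.Thread(target=run_loop, args=(loop,)) for loop in loops]
for thread in threads:
    thread.start()

# ... schedule fetches on a loop from another thread with loop.add_callback(...) ...

# Shut the loops back down and release their resources
for loop, thread in zip(loops, threads):
    loop.add_callback(loop.stop)  # add_callback is thread-safe; stop() runs on the loop's own thread
    thread.join()
    loop.close()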
If it helps, here is my current code:
import datetime
from tornado.httpclient import AsyncHTTPClient
import tornado.ioloop

PROXIES = []


def load_proxies():
    """Read proxies from file and store them in PROXIES"""
    ...


def test_proxies():
    """Test proxies"""
    global PROXIES
    print '\nProxy Count: ' + str(len(PROXIES)) + '\n'

    for proxy in PROXIES:
        request = tornado.httpclient.HTTPRequest("http://target.com", request_timeout=5)
        request.proxy_host = proxy['host']
        request.proxy_port = proxy['port']
        HTTP_CLIENT.fetch(request, handle_response)

    # Run this function again one second from now on the same IOLoop
    tornado.ioloop.IOLoop.current().add_timeout(datetime.timedelta(seconds=1), test_proxies)


def handle_response(response):
    """Handles response"""
    # Look up the proxy entry this response belongs to
    try:
        proxy = {d['host']: d for d in PROXIES}[response.request.proxy_host]
    except KeyError:
        return

    if response.code != 200:
        PROXIES.remove(proxy)

    print response.code


# The curl-based client is needed for proxy support
AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient", max_clients=10000)
HTTP_CLIENT = AsyncHTTPClient()

load_proxies()
test_proxies()
tornado.ioloop.IOLoop.current().start()