I'm writing a program that downloads data from a website (eve-central.com). It returns xml when I send a GET request with some parameters. The problem is that I need to make about 7080 of such requests because i can't specify the typeid
parameter more than once.
def get_data_eve_central(typeids, system, hours, minq=1, thread_count=1):
import xmltodict, urllib3
pool = urllib3.HTTPConnectionPool('api.eve-central.com')
for typeid in typeids:
r = pool.request('GET', '/api/quicklook', fields={'typeid': typeid, 'usesystem': system, 'sethours': hours, 'setminQ': minq})
answer = xmltodict.parse(r.data)
It was really slow when I just connected to the website and made all the requests so I decided to make it use multiple threads at a time (I read that if the process involves a lot of waiting (I/O, HTTP requests), it can be speeded up a lot with multithreading). I rewrote it using multiple threads, but it somehow isn't any faster (a bit slower in fact). Here's the code rewritten using multithreading:
def get_data_eve_central(all_typeids, system, hours, minq=1, thread_count=1):
if thread_count > len(all_typeids): raise NameError('TooManyThreads')
def requester(typeids):
pool = urllib3.HTTPConnectionPool('api.eve-central.com')
for typeid in typeids:
r = pool.request('GET', '/api/quicklook', fields={'typeid': typeid, 'usesystem': system, 'sethours': hours, 'setminQ': minq})
answer = xmltodict.parse(r.data)['evec_api']['quicklook']
answers.append(answer)
def chunkify(items, quantity):
chunk_len = len(items) // quantity
rest_count = len(items) % quantity
chunks = []
for i in range(quantity):
chunk = items[:chunk_len]
items = items[chunk_len:]
if rest_count and items:
chunk.append(items.pop(0))
rest_count -= 1
chunks.append(chunk)
return chunks
t = time.clock()
threads = []
answers = []
for typeids in chunkify(all_typeids, thread_count):
threads.append(threading.Thread(target=requester, args=[typeids]))
threads[-1].start()
threads[-1].join()
print(time.clock()-t)
return answers
What I do is I divide all typeids
into as many chunks as the quantity of threads i want to use and create a thread for each chunk to process it. The question is: what can slow it down? (I apologise for my bad english)