I am trying to fetch data for multiple clients (listed in a CSV file) by making calls to an internal system API. Currently, averaged over 1,000 API calls with 15 threads running, the response time is 0.6691 seconds per call. Over the course of half a million requests, that adds up to about 93 hours. My code, without too many details of the API, looks like this:
import csv
import time

import requests
from threading import Thread, activeCount

def get_file():
    count = -1
    freader = open(f_name, 'rU')
    csvreader = csv.reader(freader)
    for row in csvreader:
        userid = str(row[0])
        count += 1
        # throttle: wait until fewer than 15 worker threads are running
        while activeCount() > 15:
            time.sleep(10)
        thread = Thread(target=check, args=(userid, count))
        thread.start()
        # check(userid)
        thread.join()
def check(userid, count):
    headers = {
        'Accept': '*/*',
        'Authorization': authStr,
        'accept-encoding': 'gzip, deflate',
    }
    url = "{}{}/{}/{}".format(api_url, site_id, id_type, userid)
    response = requests.get(url, headers=headers)
    if response.status_code == 404:
        viewed_count = 0
    else:
        viewed_count = json_extract(response.json())
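One thing I have read is that each `requests.get()` call here opens a fresh TCP connection, and that a shared `requests.Session` would reuse connections between calls. A minimal sketch of how `check` might use one (`api_url`, `site_id`, `id_type`, `json_extract`, and the token are stand-ins for my real values above):

```python
import requests

# A shared Session keeps TCP connections alive across calls, instead of
# opening a new connection for every requests.get() as in check() above.
session = requests.Session()
session.headers.update({
    'Accept': '*/*',
    'Authorization': 'Bearer <token>',   # placeholder for my real authStr
    'accept-encoding': 'gzip, deflate',
})

def check_with_session(userid):
    # URL built the same way as in my code; names are stand-ins
    url = "{}{}/{}/{}".format(api_url, site_id, id_type, userid)
    response = session.get(url)          # headers come from the session
    if response.status_code == 404:
        return 0
    return json_extract(response.json())
```

I have not measured whether connection reuse alone makes a meaningful dent in the 0.6691-second average.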
How can I speed this up? What is the maximum number of threads (`activeCount`) I can specify? And is there an easier, faster, and more elegant way to do this?
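One direction I have seen suggested is `concurrent.futures.ThreadPoolExecutor`, which would replace my manual `activeCount()` throttling and `time.sleep(10)` polling with a fixed-size pool. A minimal sketch with the API call stubbed out (`check_stub` and the sample user IDs are placeholders, not my real code):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def check_stub(userid):
    # stand-in for the real check(): would do the HTTP call and
    # return the viewed count; here it just returns a dummy value
    return len(userid)

userids = ["u1", "u22", "u333"]

results = {}
# the pool caps concurrency at max_workers, so no manual throttling loop
with ThreadPoolExecutor(max_workers=15) as pool:
    futures = {pool.submit(check_stub, uid): uid for uid in userids}
    for fut in as_completed(futures):
        results[futures[fut]] = fut.result()
```

Is this pattern what I should be using instead of starting and joining `Thread` objects by hand, and if so, how high can `max_workers` reasonably go for an I/O-bound workload like this?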