Here I am calling an API once for each value in the list list_of_string_ids.
I expect to create 20 threads, have each one make a request and write the values to the DB, and then have them all return and go on to take the next data, and so on.
I am having trouble getting this to work with threading. The code below works correctly, but it takes very long to finish execution (around 45 minutes or more). The website I am getting the data from allows async I/O at a rate of 20 requests.
I assume this could make my code up to 20x faster, but I am not sure how to implement it.
import requests
import json
import time
import threading
import queue

headers = {'Content-Type': 'application/json',
           'Authorization': 'Bearer TOKEN'}
start = time.perf_counter()
project_id_number = 123
project_id_string = 'pjiji4533'
name = "Assignment"
list_of_string_ids = [132,123,5345,123,213,213,...,n] # Len of list is 20000

def construct_url_threaded(project_id_number, id_string):
    # Build the URL for one ID and fetch it
    url = "https://api.test.com/{}/{}".format(project_id_number, id_string)
    r = requests.get(url, headers=headers) # Max rate allowed is 20 requests at once.
    json_text = r.json()
    comments = json.dumps(json_text, indent=2)
    for item in json_text['data']:
        pass # DO STUFF

# Sequential version: one request at a time over all IDs
for string_id in list_of_string_ids:
    construct_url_threaded(project_id_number=project_id_number, id_string=string_id)
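For comparison, the sequential loop above could be run on a pool capped at 20 threads. This is only a minimal sketch under assumptions: `fetch_one` and `fetch_all` are hypothetical names, and `fetch_one` returns fake data here so the snippet runs without network access; the real version would call `requests.get` and do the per-item work.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 20  # assumed to match the API's limit of 20 requests at once


def fetch_one(string_id):
    # Hypothetical stand-in for the real request; replace the body with
    # the requests.get(...) call and the per-item processing.
    return {"id": string_id, "data": [string_id * 2]}


def fetch_all(ids, worker=fetch_one, max_workers=MAX_WORKERS):
    # executor.map runs worker on at most max_workers threads at a time
    # and returns results in the same order as the input ids.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(worker, ids))


results = fetch_all([132, 123, 5345])
```

Since the work is I/O-bound, threads spend most of their time waiting on the network, which is the case where a thread pool tends to help.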
My attempt is below:
def main():
    q = queue.Queue()
    threads = [threading.Thread(target=create_url_threaded, args=(project_id_number,string_id, q)) for i in range(5) ] #5 is for testing
    for th in threads:
        th.daemon = True
        th.start()
    result1 = q.get()
    result2 = q.get()
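The behaviour described at the top (a fixed number of threads that each take an ID, process it, then take the next one) might be sketched with a task queue as below. This is a sketch under assumptions, not a definitive implementation: `process`, `worker`, and `run` are hypothetical names, and `process` is a stand-in that does no network or DB work.

```python
import queue
import threading

NUM_WORKERS = 20  # assumed to match the API's limit of 20 requests at once


def process(string_id):
    # Hypothetical stand-in for "request the URL and write to the DB".
    return string_id * 2


def worker(tasks, results):
    # Keep taking IDs until the sentinel None arrives.
    while True:
        string_id = tasks.get()
        if string_id is None:
            break
        results.put((string_id, process(string_id)))


def run(ids, num_workers=NUM_WORKERS):
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for th in threads:
        th.start()
    for string_id in ids:
        tasks.put(string_id)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so every thread exits
    for th in threads:
        th.join()
    # All workers have finished, so the results queue can be drained safely.
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected


processed = run([1, 2, 3, 4, 5])
```

Results arrive in completion order rather than input order, so each one is paired with its ID.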