I am writing a Python script that makes about 6000 API calls, and it takes about 20 minutes to complete when run synchronously. I am trying to see if I can reduce the processing time to less than a minute.
Below are the steps I am taking:
- Read a CSV file that contains SKUs (unique product identifiers)
- Inside a function, use a for loop to construct a URL with the SKU at the end
- Take the response object and pass it to another function (the response object contains the product name and price)
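The steps above can be sketched as follows. This is a minimal sketch, not my actual script: the endpoint `https://api.example.com/products/`, the function names, and the assumption that the CSV has one SKU per row with no header are all hypothetical, and `fake_fetch` stands in for the real HTTP call so the snippet runs without network access.

```python
import csv
import io

# Hypothetical endpoint; the real API URL is not shown in this post.
BASE_URL = "https://api.example.com/products/"


def read_skus(csv_file):
    """Return the SKU from the first column of each non-empty row."""
    return [row[0].strip() for row in csv.reader(csv_file) if row and row[0].strip()]


def build_url(sku):
    """Construct the request URL with the SKU at the end."""
    return BASE_URL + sku


def fetch_all_sync(skus, fetch):
    """Call fetch(url) once per SKU, one after another (the slow version)."""
    return {sku: fetch(build_url(sku)) for sku in skus}


def fake_fetch(url):
    # Stand-in for the real HTTP request; returns a name and a price.
    return {"name": "item-" + url.rsplit("/", 1)[-1], "price": 1.0}


skus = read_skus(io.StringIO("A100\nB200\n"))
results = fetch_all_sync(skus, fake_fetch)
```

Because each call waits for the previous one to finish, the total time is roughly the sum of all request latencies.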
As I said, the synchronous version takes 20 minutes or more. I tried using threads, and the processing time went down to 5 minutes, but the issue with threads is that the function run in a thread does not return anything to the caller. To work around that I used the queue module, but then the processing time increased.
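To illustrate the queue approach: a plain `threading.Thread` discards the return value of its target, so results have to be handed back some other way, for example through a `queue.Queue`. The sketch below is an assumption about how such a setup might look, not the exact code I ran; `fetch_with_queue`, `num_workers`, and `fake_fetch` are hypothetical names.

```python
import queue
import threading


def fetch_with_queue(skus, fetch, num_workers=10):
    """Run fetch(sku) in worker threads and collect results via a queue."""
    tasks = queue.Queue()
    results = queue.Queue()
    for sku in skus:
        tasks.put(sku)

    def worker():
        # Each worker keeps pulling SKUs until the task queue is empty.
        while True:
            try:
                sku = tasks.get_nowait()
            except queue.Empty:
                return
            results.put((sku, fetch(sku)))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All workers have finished, so the results queue can be drained safely.
    collected = {}
    while not results.empty():
        sku, value = results.get()
        collected[sku] = value
    return collected


def fake_fetch(sku):
    # Stand-in for the real API call.
    return {"name": "item-" + sku, "price": 1.0}


collected = fetch_with_queue(["A100", "B200", "C300"], fake_fetch, num_workers=2)
```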
I am new to Python, so I am not sure what I can do to reduce the processing time. I looked into asyncio, but it is very confusing and I am not even sure where to start with it.
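For orientation, the general shape of an asyncio version looks roughly like the sketch below. It is only an illustration under assumptions: `fetch` is a hypothetical coroutine standing in for a real HTTP request (which would need an async HTTP client such as aiohttp, not shown here), and the concurrency `limit` of 50 is an arbitrary value.

```python
import asyncio


async def fetch_all_async(skus, fetch, limit=50):
    """Run fetch(sku) for every SKU concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def fetch_one(sku):
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return sku, await fetch(sku)

    pairs = await asyncio.gather(*(fetch_one(sku) for sku in skus))
    return dict(pairs)


async def fake_fetch(sku):
    # Stand-in for a real async HTTP call; sleeps to simulate network latency.
    await asyncio.sleep(0.01)
    return {"name": "item-" + sku, "price": 1.0}


async_results = asyncio.run(fetch_all_async(["A100", "B200", "C300"], fake_fetch))
```

Unlike threads, the coroutines return their values directly through `asyncio.gather`, so no queue is needed to collect results.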
So I added the code below, but I am not sure it is the best way to handle this. It completes 2750 API calls in 127 seconds.
I wrote 10 functions and I am using 10 threads to call them.
    import concurrent.futures

    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Submit each of the 10 functions to the thread pool
        first = executor.submit(get_stock_info_first)
        second = executor.submit(get_stock_info_second)
        third = executor.submit(get_stock_info_third)
        fourth = executor.submit(get_stock_info_fourth)
        fifth = executor.submit(get_stock_info_fifth)
        sixth = executor.submit(get_stock_info_sixth)
        seventh = executor.submit(get_stock_info_seventh)
        eight = executor.submit(get_stock_info_eight)
        t9 = executor.submit(get_stock_info_t9)
        t10 = executor.submit(get_stock_info_t10)
        # result() blocks until the corresponding function has finished
        return_value1 = first.result()
        return_value2 = second.result()
        return_value3 = third.result()
        return_value4 = fourth.result()
        return_value5 = fifth.result()
        return_value6 = sixth.result()
        return_value7 = seventh.result()
        return_value8 = eight.result()
        return_value9 = t9.result()
        return_value10 = t10.result()
I am trying to see how I can improve this code. Please let me know if there is a better way to do this.
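One possible simplification, given only as a hedged sketch: if the ten functions differ only in which SKUs they handle, a single function could be passed to `executor.map`, which runs the calls in the pool and yields results in input order. `get_stock_info`, `fetch_all`, and `fake_fetch` below are hypothetical stand-ins, and `max_workers=10` simply mirrors the ten threads I use now.

```python
import concurrent.futures


def get_stock_info(sku, fetch):
    """Hypothetical single replacement for the ten near-identical functions."""
    return fetch(sku)


def fetch_all(skus, fetch, max_workers=10):
    """Fetch every SKU using a thread pool; results keep the input order."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the same order as the input SKUs.
        return list(executor.map(lambda sku: get_stock_info(sku, fetch), skus))


def fake_fetch(sku):
    # Stand-in for the real API call.
    return {"sku": sku, "name": "item-" + sku, "price": 1.0}


pool_results = fetch_all(["A100", "B200", "C300"], fake_fetch)
```

With this shape, the number of threads becomes a single parameter instead of being tied to the number of hand-written functions.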