In my solution I have a list object called output_list. I am parsing product structure tree data from an API. Because the API calls are I/O-bound and slow, I am using concurrent.futures to speed up the process.

import concurrent.futures
import traceback

output_list = []
input_list = [...]  # List of products to fetch data for.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    # Map each future back to the product it was submitted for.
    result_future = {executor.submit(breakdown, product, output_list, log_file): product for product in input_list}
    for future in concurrent.futures.as_completed(result_future):
        try:
            future.result()  # Return value is unused; this only re-raises worker exceptions.
        except Exception:
            log_file.write(traceback.format_exc())
            raise
list_to_json_blob(output_list)  # Function to transform output_list to a JSON blob.

def breakdown(product, output_list, log_file):
    xml_data = api_function(product)  # Fetches product structure data, one level down.
    output_list.extend([product])  # Extend the shared output list.
    sub_products = find_subproducts(xml_data)  # Returns sub-products; empty list at the bottom of the tree.
    for sub_product in sub_products:
        breakdown(sub_product, output_list, log_file)

Thus I will have multiple threads extending the same list object in the recursive function. Is there any risk involved in doing so? If so, what would be the best practice to achieve the same purpose?
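
To make the question concrete, here is a minimal sketch of one alternative I am considering: each task builds and returns its own list, and only the main thread merges the results, so no list is shared between threads. This is an illustration under assumptions, not my real code: FAKE_TREE, fetch_subproducts and collect are hypothetical names, and fetch_subproducts stands in for the combination of api_function and find_subproducts.

```python
import concurrent.futures

# Hypothetical stand-in for the real API: maps a product to its direct sub-products.
FAKE_TREE = {
    "A": ["A1", "A2"],
    "A1": ["A1a"],
    "B": ["B1"],
}


def fetch_subproducts(product):
    """Hypothetical replacement for api_function + find_subproducts."""
    return FAKE_TREE.get(product, [])


def breakdown(product):
    """Return the product followed by all of its descendants (depth-first)."""
    result = [product]  # Local to this call, never shared across threads.
    for sub_product in fetch_subproducts(product):
        result.extend(breakdown(sub_product))
    return result


def collect(input_list, max_workers=4):
    """Run breakdown once per top-level product and merge in the main thread."""
    output_list = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in input order, so the merged list is deterministic.
        for partial in executor.map(breakdown, input_list):
            output_list.extend(partial)
    return output_list


output_list = collect(["A", "B"])
print(output_list)
```

With this shape, output_list is only ever touched by the main thread, and an exception raised inside a worker is re-raised when its result is consumed from executor.map. The trade-off is that results arrive in input order rather than completion order.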