I want to enable parallel processing/threading of my program using the concurrent.futures module.
Unfortunately I can't seem to find any nice, simple, idiot-proof examples of using the concurrent.futures module. They typically require more advanced knowledge of python or processing/threading concepts and jargon.
The below is a simplified, self-contained example based on my program: there's a purely CPU bound task ideal for multiprcessing, and a separate IO bound task inserting into a database (SQLite). In my program I've already converted this to use the multiprocessing pool class, but because the results from the CPU bound task are all collected up waiting for the tasks to finish, it uses massive amounts of memory. Thus I'm looking to use a combination of threading/processing which I believe concurrent.futures can do for me fairly simply.
So how do I convert the below into something that uses this module?
import sqlite3
#Stand in CPU intensive task
def calculate(value):
return value * 10
#Stand in Thread I/O intensive task
def output(value):
global db
if (value % 1000) == 0:
db.execute('delete from test_table')
db.execute('insert into test_table (result) values (?)', (value,))
def main():
global db
results = []
db = sqlite3.connect('e:\\z_dev\\test.sqlite')
db.cursor()
#=========
#Perform CPU intensive task
for i in range(1000):
results.append( calculate(i))
#Perform Threading intensive task
for a in results:
output(a)
#=========
db.commit()
db.close()
if __name__ == '__main__':
main()
I'm looking for an answer that doesn't use any fancy/complex python. Or a nice clear simple explanation, or ideally both!
Thanks
Edit: My current "multiprocessor" implementation. Probably wrong, but it seems to work. No threading whatsoever. This goes inside the "#=========" part of the above.
#Multiprocessing
pool = multiprocessing.Pool(None)
for i in range(1000):
results.append( pool.apply_async(calculate(i)))
pool.close()
pool.join()
for i in results:
results[i] = results[i].get()
#Complete lack of threading; but if I had it, it'd be here:
for a in results:
output(a)