I am using mpi4py for a project I want to parallelize. Below is very basic pseudo code for my program:
Load list of data from sqlite database
Based on comm.rank and comm.size, select a chunk of data to process
Process data...
Use MPI.Gather to pass all of the results back to root
if root:
    iterate through results and save to sqlite database
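A minimal sketch of the chunk-selection step, assuming a simple round-robin split. `select_chunk` is a hypothetical helper used only for illustration; in the real program `rank` and `size` would come from `MPI.COMM_WORLD.Get_rank()` and `MPI.COMM_WORLD.Get_size()`.

```python
def select_chunk(items, rank, size):
    """Return the items assigned to process `rank` out of `size` processes.

    Assumption: a round-robin split, where process `rank` takes every
    `size`-th item starting at index `rank`. Every item is assigned to
    exactly one process, and chunk lengths differ by at most one.
    """
    return items[rank::size]


if __name__ == "__main__":
    data = list(range(10))
    # Simulate three ranks in a single process, without MPI.
    for rank in range(3):
        print(rank, select_chunk(data, rank, 3))
```

Each process would call this with its own rank, so no communication is needed to decide who handles which rows.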
I would like to eliminate the call to MPI.Gather by simply having each process write its own results to the database. So I want my pseudo code to look like this:
Load list of data
Select chunk of data
Process data
Save results
This would drastically improve my program's performance. However, I am not entirely sure how to accomplish this. I have searched on Google, but the only thing I could find is MPI-IO. Is it possible to use MPI-IO to write to a database, specifically using Python, SQLite, and mpi4py? If not, are there any alternatives for writing concurrently to a SQLite database?
EDIT:
As @CL pointed out in a comment, sqlite3 does not support concurrent writes to the database. So let me ask my question a little differently: is there a way to lock writes to the database so that other processes wait until the lock is released before writing? I know sqlite3 has its own locking modes, but these modes seem to cause insertions to fail rather than block. I know I've seen something like this in Python threading, but I haven't been able to find anything online about doing this with MPI.
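To make the question concrete, here is a minimal sketch of the blocking behaviour I am after. It assumes that the `timeout` argument of `sqlite3.connect` makes a writer wait for a locked database instead of failing right away, and that `BEGIN IMMEDIATE` requests the write lock at the start of the transaction. I have not confirmed how well this holds up when many MPI ranks write at once. The function name `save_results`, the `results` table, and its columns are hypothetical.

```python
import os
import sqlite3
import tempfile


def save_results(db_path, worker, values, timeout=60.0):
    """Write one process's results, waiting up to `timeout` seconds for the lock."""
    # `timeout` is how long sqlite3 waits on a locked database before
    # raising sqlite3.OperationalError. isolation_level=None disables the
    # module's implicit transactions so the BEGIN below is the only one.
    conn = sqlite3.connect(db_path, timeout=timeout, isolation_level=None)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS results (worker INTEGER, value REAL)"
        )
        # Ask for the write lock up front, so any waiting happens here
        # rather than partway through the inserts.
        conn.execute("BEGIN IMMEDIATE")
        conn.executemany(
            "INSERT INTO results (worker, value) VALUES (?, ?)",
            [(worker, v) for v in values],
        )
        conn.execute("COMMIT")
    finally:
        # Closing without COMMIT discards an unfinished transaction.
        conn.close()


if __name__ == "__main__":
    # Simulate two ranks writing one after the other, without MPI.
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    save_results(path, 0, [1.0, 2.0])
    save_results(path, 1, [3.0])
```

In the real program each rank would call `save_results` with its own rank and results. If the wait exceeds `timeout`, the call still raises `sqlite3.OperationalError`, so this only postpones the failure rather than removing it.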