I have a database connection to PostgreSQL using the package RPostgreSQL. Currently I do the following:
- retrieve a list from my database
- run the list through a for loop, doing a calculation and writing the value back to the database
I am interested in parallelising this process. The obvious approach is to use foreach from the package of the same name. However, this needs connection pooling: I would like to know whether anyone knows of a parallel backend that lets the workers share my database connection. Here is a specific unresolved example:
In the above case there is no connection pooling in the registerDoMC parallel backend (from the doMC package); the workaround is to open and close the connection within each %dopar% worker. The registerDoSNOW parallel backend from the doSNOW package, which is built on snow, does not offer this functionality either.
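The workaround mentioned above (opening and closing the connection inside each %dopar% task) could look roughly like the sketch below. It is only an illustration under assumptions: the connection parameters, the `results` table with its `id` and `value` columns, the helper names `open_con`, `write_value` and `process_ids`, and the squaring step standing in for the real calculation are all hypothetical and not taken from my actual setup.

```r
library(foreach)
library(doMC)   # fork-based backend, so Unix-alikes only

registerDoMC(cores = 2)

# Hypothetical connection factory: every argument here is a placeholder.
open_con <- function() {
  DBI::dbConnect(RPostgreSQL::PostgreSQL(), dbname = "mydb",
                 host = "localhost", user = "me", password = "secret")
}

# Hypothetical writer: table and column names are made up for illustration.
write_value <- function(con, id, value) {
  sql <- sprintf("UPDATE results SET value = %f WHERE id = %d",
                 value, as.integer(id))
  DBI::dbGetQuery(con, sql)
}

process_ids <- function(ids, connect = open_con, write = write_value,
                        disconnect = DBI::dbDisconnect) {
  foreach(id = ids, .combine = c) %dopar% {
    con <- connect()                 # one connection per task, opened in the worker
    tryCatch({
      value <- id^2                  # stand-in for the real calculation
      write(con, id, value)
      value
    }, finally = disconnect(con))    # always close, even if the task fails
  }
}
```

The connection, write and disconnect steps are passed in as functions so the loop can be tried without a database by supplying dummy functions. The obvious cost of this pattern is one connect/disconnect round trip per list element.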
The alternative would be to use mclapply instead of %dopar%. In this case, does anyone know whether, and if so how, the database connection can be shared with each mclapply worker?
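For comparison, here is a minimal sketch of the mclapply variant in which nothing is shared: the list is split into one contiguous chunk per core, and each forked worker opens a single connection for its whole chunk. The PostgreSQL client library documentation warns against using a connection that was open before a fork from the child processes, so the sketch deliberately connects after the fork. As before, the connection parameters, the `results` table, the helper names and the squaring step are hypothetical placeholders.

```r
library(parallel)   # mclapply forks, so mc.cores > 1 works on Unix-alikes only

# Hypothetical connection factory: every argument here is a placeholder.
open_con <- function() {
  DBI::dbConnect(RPostgreSQL::PostgreSQL(), dbname = "mydb",
                 host = "localhost", user = "me", password = "secret")
}

# Hypothetical writer: table and column names are made up for illustration.
write_value <- function(con, id, value) {
  sql <- sprintf("UPDATE results SET value = %f WHERE id = %d",
                 value, as.integer(id))
  DBI::dbGetQuery(con, sql)
}

# Runs inside a forked worker: one connection serves the whole chunk.
process_chunk <- function(ids, connect, write, disconnect) {
  con <- connect()
  on.exit(disconnect(con), add = TRUE)
  vapply(ids, function(id) {
    value <- id^2                    # stand-in for the real calculation
    write(con, id, value)
    value
  }, numeric(1))
}

process_parallel <- function(ids, cores = 2L, connect = open_con,
                             write = write_value,
                             disconnect = DBI::dbDisconnect) {
  # Contiguous chunks, at most one per core, so input order is preserved.
  chunk_size <- ceiling(length(ids) / cores)
  chunks <- split(ids, ceiling(seq_along(ids) / chunk_size))
  res <- mclapply(chunks, process_chunk, connect = connect, write = write,
                  disconnect = disconnect, mc.cores = cores)
  unlist(res, use.names = FALSE)
}
```

Compared with connecting once per element, this opens only as many connections as there are chunks, which is about as close to pooling as I can get without a backend that supports it.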