I am working with panel data of about 3*10^6 rows and 11 columns. The data is stored in a SQL database (what I use is a subset of an even larger dataset). I want to process the data in chunks (each date separately) without importing the entire thing at once, but the operations I have to run on each chunk are computationally intensive, so I would like to use the foreach package to run them in parallel.
However, I am running into problems when I run the SQL call in parallel. If I do a simple call such as sqlQuery(channel, "select 1"), I can run that in parallel. If I do a more complex statement, I get SIGPIPE errors. Has anyone else run into similar problems?
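
For concreteness, here is a minimal sketch of the kind of per-date loop I mean, not my exact code. The DSN `my_dsn`, the table `panel`, the column `date`, and the functions `date_query`, `process_chunk` and `run_by_date` are all hypothetical placeholders. This version opens the connection inside each iteration, on the assumption that a channel created in the parent process cannot safely be shared between workers; whether that is actually the cause of the SIGPIPE errors is something I have not confirmed.

```r
# Hypothetical names throughout: DSN "my_dsn", table "panel", column "date".

# Build the per-date query string (assumes dates are given as character).
date_query <- function(d) {
  sprintf("select * from panel where date = '%s'", d)
}

# Placeholder for the expensive per-date computation.
process_chunk <- function(chunk) {
  nrow(chunk)
}

run_by_date <- function(dates, dsn = "my_dsn", cores = 2) {
  `%dopar%` <- foreach::`%dopar%`
  cl <- parallel::makeCluster(cores)
  on.exit(parallel::stopCluster(cl), add = TRUE)
  doParallel::registerDoParallel(cl)

  foreach::foreach(d = dates, .packages = "RODBC",
                   .export = c("date_query", "process_chunk")) %dopar% {
    # Assumption: each iteration opens its own channel in the worker
    # instead of reusing one created in the parent process.
    channel <- odbcConnect(dsn)
    tryCatch({
      chunk <- sqlQuery(channel, date_query(d))
      process_chunk(chunk)
    }, finally = odbcClose(channel))
  }
}
```

It would be called as something like `run_by_date(c("2010-01-04", "2010-01-05"))`, returning one result per date as a list.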