I'm trying to query my DB a large number of times and apply some logic to each query's result set.
I'm using ROracle and foreach with %dopar% to do so (BTW, my first try was with RJDBC, but I switched to ROracle because I kept getting "Error reading from connection"; I no longer get that error, but now I have the problem described below).
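For reference, the RJDBC version looked roughly like this (the driver jar path, host/port/SID and credentials below are just placeholders, not my real values):

library(RJDBC)

# placeholder driver class / jar path
drv  <- JDBC("oracle.jdbc.OracleDriver", "/path/to/ojdbc8.jar")
# placeholder connection string and credentials
conn <- dbConnect(drv, "jdbc:oracle:thin:@myhost:1521:mysid", "user", "password")

res <- dbGetQuery(conn, "SELECT 1 FROM dual")
dbDisconnect(conn)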
The problem is that most of the processes die (become zombies) during the parallel session. I monitor this with the top command on my Linux system, with the log file that shows the progress of the parallel loop, and by monitoring my DB during the session. When I start the program, I can see that the workers are loaded and the program progresses at a high pace, but then most of them die and the program becomes slow (or stops working altogether) with no error message.
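On the DB side I check the open sessions with something like this (assuming I have read access to v$session; the credentials and username here are placeholders):

library(ROracle)

drv  <- dbDriver("Oracle")
conn <- dbConnect(drv, username = "user", password = "password", dbname = "mydb")  # placeholders

# count sessions per status for my DB user (placeholder username)
dbGetQuery(conn, "SELECT status, COUNT(*) FROM v$session WHERE username = 'MYUSER' GROUP BY status")
dbDisconnect(conn)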
Here is some example code of what I'm trying to do:
library(doParallel)
library(ROracle)
temp <- function(i) {
  # Because you can't access my DB, there's no point in filling in the rows below
  # (where I put three dots) - but I checked my DB connection and it works fine.
  drv <- ...
  host <- ...
  port <- ...
  sid <- ...
  connect.string <- paste(...)
  conn_oracle <- dbConnect(drv, username=..., password=..., dbname=connect.string)

  # run the query for this iteration
  myData <- dbGetQuery(conn_oracle, sprintf("SELECT '%s' FROM dual", i))
  print(i)
  dbDisconnect(conn_oracle)

  # return the query result so foreach can rbind it
  myData
}
cl <- makeCluster(10, outfile = "par_log.txt")
registerDoParallel(cl)
output <- foreach(i=1:100000, .inorder=TRUE, .verbose=TRUE, .combine='rbind',
                  .packages=c('ROracle'),
                  .export=c('temp')) %dopar% {
  temp(i)
}
stopCluster(cl)
Any help will be appreciated!