0

I am trying to write my dataframe to Oracle DB using RJDBC connection. I am trying to implement a parallel approach using foreach / parLappy. Here is my code

Sys.setenv(JAVA_HOME='C:/Program Files/Java/jre1.8.0_181') 
library(rJava)
library(RJDBC)
library(DBI)

jdbcDriver =JDBC("oracle.jdbc.OracleDriver",classPath="C:/Program Files/directory/ojdbc6.jar", identifier.quote = "\"") 
jdbcConnection =dbConnect(jdbcDriver, "jdbc:oracle:thin:@//XXXX/YYY", "ZZZ", "TTT")
# connected to DB

After this step I get some data from DB. After processing it I want to write the obtained data frame (brand3.merge.u) to another table in Oracle DB. My code is

library(foreach)
library(doParallel)

#setup clusters
cl<-makeCluster(7)
registerDoParallel(cl)

clusterExport(cl, varlist = list("jdbcConnection", "brand3.merge.u"))

foreach(x = 1:length(brand3.merge.u$CELL_PH_NUM), .packages=c( "rJava", "RJDBC", "DBI", "data.table"), .combine = 'c') %dopar% {
  rJava::.jinit()
  RJDBC::dbSendUpdate(jdbcConnection, "INSERT INTO xxnvdw.an_cust_analytics  VALUES(?,?,?,?,?,?,?,?)", brand3.merge.u[x, 1], brand3.merge.u[x,2], brand3.merge.u[x,3],brand3.merge.u[x,4],brand3.merge.u[x,5],brand3.merge.u[x,6],brand3.merge.u[x,7],brand3.merge.u[x,8])
}

I use rJava::.jinit() to avoid JVM error. But now I am getting

Error in { : 
  task 1 failed - "RcallMethod: attempt to call a method of a NULL object."

error. How can I avoid this error? When I use print function and print my dataframe inside foreach I can get result but dbSendUpdate function yields an error. How can I fix my "do stuff" part of the foreach loop?

NOTE : I have already seen the similar question Error in { : task 3 failed - "RcallMethod: attempt to call a method of a NULL object." but in this question "do stuff" part of the foreach is not given and I already used clusterExport function. So my question is not a duplicated question.

SOLUTION: Thanks to @HenrikB and @F. Privé I solved the problem. To anyone who face with the same problem my solution is:

foreach(x = 1:iters,  .packages=c( "rJava", "RJDBC", "DBI"), .combine = 'c') %dopar% {
  rJava::.jinit()
  jdbcDriver =JDBC("oracle.jdbc.OracleDriver",classPath="C:/Program Files/directory/ojdbc6.jar", identifier.quote = "\"") # IDENTIFIER.QUOTE!!!!!
  jdbcConnection =dbConnect(jdbcDriver, "jdbc:oracle:thin:@//XXXX/YYY", "ZZZ", "TTT")
  RJDBC::dbSendUpdate(jdbcConnection, "INSERT INTO xxnvdw.an_cust_analytics  VALUES(?,?,?,?,?,?,?,?)", brand3.merge.u[x, 1], brand3.merge.u[x,2], brand3.merge.u[x,3],brand3.merge.u[x,4],brand3.merge.u[x,5],brand3.merge.u[x,6],brand3.merge.u[x,7],brand3.merge.u[x,8])
  dbDisconnect(jdbcConnection)
  }
boyaronur
  • 521
  • 6
  • 18
  • 2
    I think you need to open the connection inside the foreach loop. The DB connection might become `NULL` when you pass it to clusters. – F. Privé Jan 15 '19 at 14:55
  • 1
    F. Prive is correct, you can *not* export DBI::DBIConnection object to background R processes. The reason is that they, like for instance regular R connections (but also rJava objects), contain external references (`externalptr`), which are non-exportable. You can see other examples of this in https://cran.r-project.org/web/packages/future/vignettes/future-4-issues.html (I've added an example for DBIConnection for the next release of the future package). – HenrikB Jan 15 '19 at 19:53
  • Thanks a lot it worked, I will edit my question and include your answer in it – boyaronur Jan 16 '19 at 06:40
  • But now I am creating a driver and establishing a connection in each step. Is it a good approach? What do you think in terms of performance? – boyaronur Jan 16 '19 at 06:49

0 Answers0