
I am trying to train a neural network for churn prediction with the R package neuralnet. Here is the code:

# Min-max scale every column except the first, which is left unscaled
data <- read.csv('C:/PredictChurn.csv')
maxs <- apply(data, 2, max)
mins <- apply(data, 2, min)
scaled_temp <- as.data.frame(scale(data, center = mins, scale = maxs - mins))
scaled <- data
scaled[, -c(1)] <- scaled_temp[, -c(1)]

# 75/25 train/test split
index <- sample(1:nrow(data), round(0.75 * nrow(data)))
train_ <- scaled[index, ]
test_ <- scaled[-index, ]

library(neuralnet)

# Regress CHURNED_F on all remaining predictors
n <- names(train_[, -c(1)])
f <- as.formula(paste("CHURNED_F ~", paste(n[!n %in% "CHURNED_F"], collapse = " + ")))
nn <- neuralnet(f, data = train_, hidden = c(5), linear.output = FALSE)

It works as it should, but when training on the full data set (millions of rows) it just takes too long. I know R is single threaded by default, so I have been researching how to parallelize the work across all the cores. Is it even possible to run this function in parallel? I have tried various packages with no success.

Has anyone been able to do this? It doesn't have to be the neuralnet package; any solution that lets me train a neural network would work.

Thank you

Fermin

2 Answers


I have had good experiences with the package Rmpi, and it may be applicable in your case too.

library(Rmpi)

Briefly, its usage is as follows:

nproc <- 4  # could be determined automatically, e.g. via parallel::detectCores()
# Spawn one master and nproc-1 slaves
Rmpi::mpi.spawn.Rslaves(nslaves = nproc - 1)
# Apply "func_to_be_parallelized" to each element of var1_passed_to_func on the
# slaves, passing var2_passed_to_func through as an additional argument
my_fast_results <- Rmpi::mpi.parLapply(var1_passed_to_func,
                                       func_to_be_parallelized,
                                       var2_passed_to_func)
# Close the slaves
Rmpi::mpi.close.Rslaves(dellog = TRUE)
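
Applied to your case, a minimal sketch could look like the one below. Note the caveat raised in the comments: this parallelizes across several independent networks (here, one per candidate hidden-layer size, a purely illustrative choice), not within the training of a single network. f and train_ are taken from your question.

library(Rmpi)

mpi.spawn.Rslaves(nslaves = 3)

# The slaves need the package and the objects before they can train
mpi.bcast.cmd(library(neuralnet))
mpi.bcast.Robj2slave(train_)
mpi.bcast.Robj2slave(f)

# Train one network per candidate hidden-layer size, spread over the slaves
hidden_sizes <- list(3, 5, 7)
models <- mpi.parLapply(hidden_sizes, function(h) {
  neuralnet(f, data = train_, hidden = c(h), linear.output = FALSE)
})

mpi.close.Rslaves(dellog = TRUE)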
Michael Gruenstaeudl
  • Thanks for the tip. But I get an error when using the Rmpi library, saying msmpi.dll is missing. Also, how would you write the command? Is this OK: my_fast_results = Rmpi::mpi.parLapply(neuralnet(f,data=train_,hidden=c(5),linear.output=F)) – Fermin Jan 18 '16 at 20:08
  • @Aulait The error you receive regarding the missing dynamic link library (.dll) is indicative of some missing or outdated dependencies. I would install/update Rmpi with the dependencies flag set to true: `install.packages("Rmpi", dependencies=TRUE)`. – Michael Gruenstaeudl Jan 19 '16 at 09:40
  • @Aulait The correct syntax of command _mpi.parLapply()_ would probably be: `mpi.parLapply(f, neuralnet, train_, c(5), FALSE)`. Yes, the first argument passed to function _neuralnet()_ is the first argument of _mpi.parLapply()_. – Michael Gruenstaeudl Jan 19 '16 at 09:47
  • I'm not sure this would work. `neuralnet` models are trained iteratively over the data. If I understand correctly, this would just train `nproc-1` distinct neural nets, so I don't think this raw parallelism will help. That is why GPU implementations are so popular; however, no such implementation currently exists in R, so I think the best option for such large data is [H2O](https://cran.r-project.org/web/packages/h2o/index.html) (see the sketch below). – cdeterman Jan 19 '16 at 21:10
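
Following up on that comment, a minimal H2O sketch (assuming the question's train_ and CHURNED_F; h2o.deeplearning is multi-threaded out of the box, which sidesteps R's single-threaded limitation):

library(h2o)

# Start a local H2O cluster using all available cores
h2o.init(nthreads = -1)

train_h2o <- as.h2o(train_)
# Make sure the target is treated as categorical for classification
train_h2o[, "CHURNED_F"] <- as.factor(train_h2o[, "CHURNED_F"])

# 'hidden' mirrors the single 5-unit layer from the question
nn_h2o <- h2o.deeplearning(x = setdiff(names(train_h2o), "CHURNED_F"),
                           y = "CHURNED_F",
                           training_frame = train_h2o,
                           hidden = c(5))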

You can try the caret and doParallel packages for this. This is what I have been using; it works for some model types, but may not work for all.

  library(caret)
  library(doParallel)
  library(tictoc)

  layer1 = c(6,12,18,24,30)
  layer2 = c(6,12,18,24,30)
  layer3 = c(6,12,18,24,30)

  cv.folds = 5

  # In order to make models fully reproducible when using parallel processing,
  # we need to pass seeds as a parameter:
  # https://stackoverflow.com/questions/13403427/fully-reproducible-parallel-models-using-caret
  total.param.permutations = length(layer1) * length(layer2) * length(layer3)

  seeds <- vector(mode = "list", length = cv.folds + 1)
  set.seed(1)
  for(i in 1:cv.folds) seeds[[i]] <- sample.int(n = 1000, total.param.permutations, replace = TRUE)
  seeds[[cv.folds + 1]] <- sample.int(1000, 1, replace = TRUE) # for the last model

  nn.grid <- expand.grid(layer1 = layer1, layer2 = layer2, layer3 = layer3)

  cl <- makeCluster(floor(detectCores() * 0.5)) # use 50% of cores only, leave rest for other tasks
  registerDoParallel(cl)

  train_control <- caret::trainControl(method = "cv"
                                       ,number = cv.folds
                                       ,seeds = seeds # user-defined seeds for parallel processing
                                       ,verboseIter = TRUE
                                       ,allowParallel = TRUE
                                       )

  tic("Total Time to NN Training: ")
  set.seed(1)
  # 'formula' and 'scaled.train.data' correspond to f and train_ from the question
  model.nn.caret = caret::train(form = formula,
                       data = scaled.train.data,
                       method = 'neuralnet',
                       tuneGrid = nn.grid,
                       trControl = train_control
                       )
  toc()

  # Shut down the workers only after training has finished
  stopCluster(cl)
  registerDoSEQ()
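
Once training completes, a short follow-up for inspecting the result (scaled.test.data is an assumed hold-out set, scaled the same way as the training data):

  # Best layer sizes found by the grid search
  model.nn.caret$bestTune

  # Predict on the assumed hold-out set
  preds <- predict(model.nn.caret, newdata = scaled.test.data)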
Nikhil Gupta