
I am receiving the following error when using h2o.randomForest. The function call and the associated error are shown below.

base_line_rf <- h2o.randomForest(x=2:ncol(train),
                                y=1,
                                ntrees = 10000,
                                mtries = ncol(train)-1,
                                training_frame = train,
                                model_id = model_id,
                                stopping_rounds = 5,
                                stopping_tolerance = 0,
                                stopping_metric = "AUC",
                                binomial_double_trees = TRUE
)

The error:

java.lang.AssertionError: I am really confused about the heap usage; MEM_MAX=7624720384 heapUsedGC=7626295912
    at water.MemoryManager.set_goals(MemoryManager.java:97)
    at water.MemoryManager.malloc(MemoryManager.java:265)
    at water.MemoryManager.malloc(MemoryManager.java:222)
    at water.MemoryManager.malloc8d(MemoryManager.java:281)
    at hex.tree.DHistogram.init(DHistogram.java:281)
    at hex.tree.DHistogram.init(DHistogram.java:240)
    at hex.tree.ScoreBuildHistogram2$ComputeHistoThread.computeChunk(ScoreBuildHistogram2.java:326)
    at hex.tree.ScoreBuildHistogram2$ComputeHistoThread.map(ScoreBuildHistogram2.java:306)
    at water.LocalMR.compute2(LocalMR.java:84)
    at water.LocalMR.compute2(LocalMR.java:76)
    at water.LocalMR.compute2(LocalMR.java:76)
    at water.LocalMR.compute2(LocalMR.java:76)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1255)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.popAndExecAll(ForkJoinPool.java:904)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:977)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

What is the reason for this error?

Thank you

John Smith
  • Please provide a reproducible example with sample data: [example](https://stackoverflow.com/a/5963610/4421870) – Mako212 Oct 20 '17 at 15:15
  • You probably need more memory, check [this answer](https://stackoverflow.com/questions/45333883/h2o-server-crash). – tobiasegli_te Oct 20 '17 at 15:24
  • This is an assertion error -- assertions are disabled by default, so you must have turned them on (for debugging?). If you turn them off again, it might work, but it's also possible that another related error could pop up later on. – Erin LeDell Oct 20 '17 at 17:01
  • Actually, assertions are enabled by default when you start H2O from R, so you could try to turn it off using `h2o.init()` with `enable_assertions = FALSE`. – Erin LeDell Oct 20 '17 at 18:27

1 Answer


Based on the error, you need to set up your H2O cluster with more memory to fit a 10,000-tree random forest. It looks like the H2O cluster (a Java process) was started with roughly 8 GB of memory, but with ntrees = 10000 the model needs more than that:

MEM_MAX (max_mem_size) = 7624.720384 MB (configured maximum)
heapUsedGC = 7626.295912 MB (in use, already above the maximum)
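
To make the comparison concrete, the two byte counts from the assertion message can be converted in plain R. This is only a sketch of the arithmetic; the variable names are illustrative and nothing here calls H2O:

```r
# Byte counts copied from the AssertionError message in the question.
mem_max   <- 7624720384   # MEM_MAX: maximum heap available to the JVM
heap_used <- 7626295912   # heapUsedGC: heap reported as in use

# Convert to decimal megabytes (1 MB = 1e6 bytes), matching the figures above.
mem_max_mb   <- mem_max / 1e6
heap_used_mb <- heap_used / 1e6

# How far the used heap exceeds the configured maximum.
overshoot_bytes <- heap_used - mem_max

cat(sprintf("max: %.2f MB, used: %.2f MB, overshoot: %d bytes\n",
            mem_max_mb, heap_used_mb, as.integer(overshoot_bytes)))
```

The reported usage is above the maximum by roughly 1.5 MB, which is consistent with the "confused about the heap usage" message.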

Since you are using H2O from R, you can pass max_mem_size = "12G" to h2o.init() so that the H2O cluster starts with 12 GB of memory, which should be enough for your random forest:

h2o.init(max_mem_size="12G")
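
Keep in mind that max_mem_size only applies when h2o.init() launches a new cluster. If a cluster is already running, h2o.init() just connects to it and the existing memory limit stays in place. A minimal sketch of a clean restart, assuming a local cluster started from R (the 5-second pause is an arbitrary choice, not an H2O requirement):

```r
library(h2o)

# Shut down a running cluster if there is one; h2o.shutdown() raises an error
# when no connection exists, so that case is ignored here.
tryCatch(h2o.shutdown(prompt = FALSE), error = function(e) NULL)

# Give the old JVM a moment to exit before starting a new one.
Sys.sleep(5)

# Start a fresh local cluster with a 12 GB heap.
h2o.init(max_mem_size = "12G")
```

Any frames loaded into the old cluster are lost on shutdown, so the training data has to be imported again afterwards.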

You can also check your H2O cluster details with the command below:

> h2o.clusterInfo()
R is connected to the H2O cluster: 
    H2O cluster uptime:         19 seconds 80 milliseconds 
    H2O cluster version:        3.14.0.3 
    H2O cluster version age:    27 days  
    H2O cluster name:           H2O_started_from_R_avkashchauhan_hwc594 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   10.65 GB <=== This is the max memory size
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         XGBoost, Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.1 (2017-06-30) 
AvkashChauhan