Questions tagged [h2o]

Use this tag for questions about the H2O in-memory machine learning platform. Where relevant, add language tags like [r], [python], [scala], or [java].

Best Practices

Always post a Minimal, Complete, and Verifiable Example (MCVE) and provide the H2O version number and client type (Python, R, Flow, etc.).

If your question is not code-related, do not post it to Stack Overflow (per Stack Overflow guidelines). If your question is algorithm-related, post it to Cross Validated on Stack Exchange using the "h2o" tag. All other questions can be posted to the h2ostream Google group (please do not double-post).

1875 questions
22
votes
2 answers

How to get different Variable Importance for each class in a binary h2o GBM in R?

I'm trying to explore the use of a GBM with h2o for a classification issue to replace a logistic regression (GLM). The non-linearity and interactions in my data make me think a GBM is more suitable. I've run a baseline GBM (see below) and compared…
wake_wake
21
votes
2 answers

conversion of pandas dataframe to h2o frame efficiently

I have a Pandas dataframe that uses latin-1 encoding and is delimited by ;. The dataframe is very large, almost 350000 x 3800. I wanted to use sklearn initially, but my dataframe has missing values (NaN values), so I could not use sklearn's…
ayaan
20
votes
3 answers

How to convert r data frame to h2o object

I'm new to R and H2O and have tried to find a way to convert an R data frame to an H2O object. I have spent some time researching how to do this, with no luck. The other way around is possible and well documented, as follows. prosPath =…
plr
16
votes
3 answers

How to Setup SPARK_HOME variable?

I am following the Sparkling Water steps from the link http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.2/0/index.html. Running in the terminal: ~/InstallFile/SparklingWater/sparkling-water-2.2.0$ bin/sparkling-shell --conf…
roshan_ray
14
votes
3 answers

Fastest way to read in 100,000 .dat.gz files

I have a few hundred thousand very small .dat.gz files that I want to read into R in the most efficient way possible. I read in the file and then immediately aggregate and discard the data, so I am not worried about managing memory as I get near the…
Mike.Gahan
13
votes
1 answer

Implementation of LIME on h2o modelling in R

I want to implement LIME on a model created using h2o (deep learning) in R. To use the data in the model, I created H2OFrames and converted them back to data frames before using them in LIME (lime function, because LIME's explain function can't…
gattaca
12
votes
2 answers

is there a way to convert h2oframe to pandas dataframe

I am able to convert a dataframe to an H2OFrame, but how can I convert it back to a dataframe? If this is not possible, can I convert it to a Python list? import pandas as pd import h2o df = pd.DataFrame({'1': [2838, 3222, 4576, 5665, 5998], '2': [1123, 3228,…
JaredDudley04
12
votes
2 answers

Predict classes or class probabilities?

I am currently using H2O for a classification problem dataset. I am testing it out with H2ORandomForestEstimator in a Python 3.6 environment. I noticed that the predict method was giving values between 0 and 1 (I am assuming this is the…
Rahul
11
votes
1 answer

Is it possible to get a feature importance plot from a h2o.automl model?

I have a binary classification problem, and I am using "h2o.automl" to obtain a model. Is it possible to obtain a plot of the importances of my dataset features from the "h2o.automl" model? A pointer to some python 3 code would be much…
user274610
11
votes
2 answers

Error with H2O in R - can't connect to local host

I can't get h2o to work in R. It shows the following error, and I have no clue what it means. Previously it gave me an error because I didn't have the 64-bit version of Java. I downloaded the 64-bit version, restarted my PC, and started the process again and now…
Mayur
11
votes
6 answers

Python cannot find package h2o in anaconda

When I try to import h2o I am told that the package does not exist. When I try to install it, it tells me it already exists. I have tried wiping it out of my computer and reinstalling to no avail. At this point all I can think is some environment…
mlanier
11
votes
2 answers

How to understand the metrics of H2OModelMetrics Object through h2o.performance

After creating the model using h2o.randomForest, then using: perf <- h2o.performance(model, test) print(perf) I get the following information (value H2OModelMetrics object) H2OBinomialMetrics: drf MSE: 0.1353948 RMSE: 0.3679604 LogLoss: …
David Leal
11
votes
2 answers

R: Plot trees from h2o.randomForest() and h2o.gbm()

I am looking for an efficient way to plot trees from h2o's RF and GBM models in RStudio, H2O's Flow, or a local HTML page, similar to the one in the image in the link below. Specifically, how do you plot trees for the objects (fitted models) rf1 and gbm2…
Webby
10
votes
2 answers

H2O AutoML error "Test/Validation dataset has a non-categorical column which is categorical in the training data" on predict

I have trained and saved my H2O AutoML model. After reloading it, when I use the predict method, I get the error below: java.lang.IllegalArgumentException: Test/Validation dataset has a non-categorical column 'response' which is categorical in the…
ATUL AGARWAL
10
votes
4 answers

How to find best params of leader model in automl h2o python

I trained h2o automl and got a leader model with satisfying metrics. I want to retrain the model periodically but without using checkpoint. So, I guess all I need are the best parameters of the leader model to run it manually. I know…