
I am using random forest in H2O, but I don't understand the meaning of the parameters in the returned result. This is my original data: [image: original data]

I expected to see results like this (I set the number of trees to 3 and the response column to "Play"):

tree1:
Wind = false: yes {no=0, yes=6}
Wind = true
|   Temperature > 77.500: no {no=2, yes=0}
|   Temperature ≤ 77.500: yes {no=1, yes=5}

tree2:
Humidity > 92.500: no {no=3, yes=0}
Humidity ≤ 92.500: yes {no=2, yes=9}

tree3:
Wind = false: yes {no=0, yes=6}
Wind = true
|   Temperature > 77.500: no {no=2, yes=0}
|   Temperature ≤ 77.500: yes {no=1, yes=5}

Instead, I got a model that contains many parameters but no trees. This is my code and the result I got:

    DRFParametersV3 drfParams = new DRFParametersV3();
    drfParams.trainingFrame = H2oApi.stringToFrameKey("train");
    drfParams.validationFrame = H2oApi.stringToFrameKey("test");
    drfParams.ntrees=3;
    System.out.println("drfParams: " + drfParams);

    ColSpecifierV3 responseColumn = new ColSpecifierV3();
    responseColumn.columnName = ATT_LABEL_GOLF;
    drfParams.responseColumn = responseColumn;
    System.out.println("About to train DRF. . .");

    DRFV3 drfBody = h2o.train_drf(drfParams);
    System.out.println("drfBody: " + drfBody);

    JobV3 job = h2o.waitForJobCompletion(drfBody.job.key);
    System.out.println("DRF build done.");

    ModelKeyV3 modelKey = (ModelKeyV3)job.dest;
    ModelsV3 models = h2o.model(modelKey);
    System.out.println("models: " + models);
    System.out.println("models size: " + models.models.length);

    DRFModelV3 model = (DRFModelV3)models.models[0];
    System.out.println("new DRF model: " + model);

The resulting "DRFModelV3" is confusing. Where is the "forest" built by H2O? [image: DRFModelV3 output]

liyuhui
  • This question is very similar to this one: https://stackoverflow.com/questions/37017165/r-plot-trees-from-h2o-randomforest-and-h2o-gbm. You can also take a look at this blog post: https://aichamp.wordpress.com/2017/09/27/visualizing-h2o-gbm-and-random-forest-mojo-models-trees-in-python/ – Lauren Aug 29 '18 at 16:50
  • But I don't need to plot it, so I wouldn't use a plotting utility class in H2O. I need to build a "tree" in Java myself, so I need the data of the DRF results generated by H2O. I still don't understand how to get the real data from the "model" in H2O. – liyuhui Aug 30 '18 at 07:04
  • Do you have an example in Java? – liyuhui Aug 30 '18 at 07:07

1 Answer


One option is to download the MOJO, load it, and call the function _computeGraph on the MOJO object. Take a look at the H2O GitHub repo to learn from the code.

Please also take a look at the H2O documentation on POJOs and MOJOs.

Here is some additional code that might help: https://github.com/h2oai/h2o-3/blob/43f8ab952a69a8bc9484bd0ffac909b6e3e820ca/h2o-algos/src/test/java/hex/XValPredictionsCheck.java#L59-L69
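Once a tree has been pulled out of the MOJO, turning it into the text layout shown in the question is a plain recursive walk. The following is only a minimal sketch of that walk in plain Java. The Node class and the render method are hypothetical names made up for this illustration; they are not H2O classes, and copying the split conditions and leaf counts out of the graph returned by _computeGraph into such nodes is left out here.

```java
import java.util.ArrayList;
import java.util.List;

public class TreeTextPrinter {

    // Hypothetical node type for illustration only; it is not an H2O class.
    static final class Node {
        final String condition;   // branch test leading to this node, null for the root
        final String prediction;  // leaf summary, null for internal nodes
        final List<Node> children = new ArrayList<>();

        Node(String condition, String prediction) {
            this.condition = condition;
            this.prediction = prediction;
        }

        // Appends a child and returns this node so calls can be chained.
        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Renders every branch below the given root, one line per branch.
    static String render(Node root) {
        StringBuilder out = new StringBuilder();
        render(root, 0, out);
        return out.toString();
    }

    private static void render(Node node, int depth, StringBuilder out) {
        for (Node child : node.children) {
            // One "|   " prefix per nesting level below the root.
            for (int i = 0; i < depth; i++) {
                out.append("|   ");
            }
            out.append(child.condition);
            if (child.prediction != null) {
                out.append(": ").append(child.prediction);
            }
            out.append('\n');
            render(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Rebuilds "tree1" from the question by hand.
        Node windy = new Node("Wind = true", null)
                .add(new Node("Temperature > 77.500", "no {no=2, yes=0}"))
                .add(new Node("Temperature <= 77.500", "yes {no=1, yes=5}"));
        Node root = new Node(null, null)
                .add(new Node("Wind = false", "yes {no=0, yes=6}"))
                .add(windy);
        System.out.print(render(root));
    }
}
```

The root carries no condition of its own, so only its branches are printed, and each extra level of nesting adds one "|   " prefix.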

Lauren
  • PS: I can't do what is shown in https://github.com/h2oai/h2o-3/blob/43f8ab952a69a8bc9484bd0ffac909b6e3e820ca/h2o-algos/src/test/java/hex/XValPredictionsCheck.java#L59-L69 , because the hex.tree package cannot be found. As my code shows, I got a "DRFModelV3" model. How can I use it to get a MOJO? – liyuhui Aug 31 '18 at 07:36
  • Yes, using "SharedTreeGraph g = ((DrfMojoModel) genModel)._computeGraph(treeToPrint);" gets a tree. – liyuhui Sep 03 '18 at 08:30