1

I tried training an AutoEnsembleEstimator with two DNNEstimators (with hidden units of 1000,500, 100) on a dataset with around 1850 features (after feature engineering), and I kept running out of memory (even on larger 400G+ high-mem gcp vms).

I'm using the above for binary classification. Initially I had trained various models and combined them by training a traditional ensemble classifier over the trained models. I was hoping that Adanet would simplify the generated model graph that would make the inference easier, rather than having separate graphs/pickles for various scalers/scikit models/keras models.

Sridhar Iyer
  • 2,772
  • 1
  • 21
  • 28

1 Answers1

1

Three hypotheses:

  1. You might have too many DNNs in your ensemble, which can happen if max_iteration_steps is too small and max_iterations is not set (both of those are constructor arguments to AutoEnsembleEstimator). If you want to train each DNN for N steps, and you want an ensemble with 2 DNNs, you should set max_iteration_steps=N, set max_iterations=2, and train the AutoEnsembleEstimator for 2N steps.

  2. You might have been on adanet-0.6.0-dev, which had a memory leak. To fix this, try updating to the latest release and seeing if this problem still arises.

  3. Your batch size might have been too large. Try lowering your batch size.

Scott Yak
  • 21
  • 2