
I'm trying to build a DNNRegressor that learns from 196 features to predict a single label; all values are real numbers.

I've tried multiple ways of feeding the data and batching it, but nothing seems to work. The loss reported by fit() stays at INFO:tensorflow:loss = 1.59605e+32, and when I predict on the same training data, the output is far outside the range of my label (which lies between -1.7 and 2.6; I get predictions like 2.9873503e+09).

Can anyone help me see what I'm doing wrong?

My code below:

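import pandas as pd
import tensorflow as tf

df_train = pd.read_csv("...", delimiter="\t", index_col=0)
# df_test (used in evaluate() below) is assumed to be a DataFrame with the
# same columns as df_train; its loading is not shown here.
LABEL = 'y'
# Feature columns: every column except the label. Built as a list rather than
# with filter(), which returns a one-shot iterator on Python 3 and would be
# exhausted after its first use below.
COLUMNS = [c for c in df_train.columns if c != LABEL]

def my_input_fn(df):
    # One constant tensor of shape [n_rows, 1] per feature column.
    continuous_cols = {k: tf.constant(df[k].values, shape=[df[k].size, 1]) for k in COLUMNS}
    labels = tf.constant(df[LABEL].values)
    return continuous_cols, labels

# Every feature is treated as a real-valued input.
continuous_features = [tf.contrib.layers.real_valued_column(k) for k in COLUMNS]
regressor = tf.contrib.learn.DNNRegressor(feature_columns=continuous_features, hidden_units=[20, 10], model_dir="...")
# Each step feeds the whole training DataFrame as a single batch.
regressor.fit(input_fn=lambda: my_input_fn(df_train), steps=20000)
results = regressor.evaluate(input_fn=lambda: my_input_fn(df_test), steps=1)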

I'm running TensorFlow with GPU support. One thing I noticed is that the first time I call fit(), I get:

E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 3.94G (4233691136 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 3.94G (4233691136 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 3.94G (4233297920 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

But it still runs after that. Many thanks!

Update: I've noticed that some of the input columns are all zeros. When I remove them, the network learns and converges. I've also tried feeding these columns as categorical (binary) columns, but training still does not converge that way.
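
For reference, here is a minimal sketch of how the all-zero columns mentioned in the update could be filtered out before the feature columns are built. The helper name drop_constant_columns and the toy DataFrame are made up for illustration and are not part of my real code or data; the sketch assumes the data sits in a pandas DataFrame with the label in column 'y', as above.

```python
import pandas as pd

LABEL = "y"  # label column name, matching the code in the question


def drop_constant_columns(df, label=LABEL):
    """Return (copy of df without constant feature columns, dropped names).

    Hypothetical helper for illustration only.
    """
    features = [c for c in df.columns if c != label]
    # nunique() counts distinct non-NaN values; <= 1 means the column
    # carries no information (for example, it is all zeros).
    constant = [c for c in features if df[c].nunique() <= 1]
    return df.drop(constant, axis=1), constant


# Toy data, made up for illustration: column "b" is all zeros.
toy = pd.DataFrame({
    "a": [0.5, -1.2, 2.0],
    "b": [0.0, 0.0, 0.0],
    "y": [1.0, -0.3, 2.1],
})
cleaned, dropped = drop_constant_columns(toy)
print(dropped)                # ['b']
print(list(cleaned.columns))  # ['a', 'y']
```

With something like this, COLUMNS would be derived from the cleaned DataFrame, so the constant columns never reach real_valued_column.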

beastiecho
  • For your out of memory error see https://stackoverflow.com/questions/39465503/cuda-error-out-of-memory-in-tensorflow and https://stackoverflow.com/questions/34514324/error-using-tensorflow-with-gpu/34514932#34514932 – BoboDarph Jul 06 '17 at 15:01
