0

I have been doing the basic linear regression TensorFlow tutorial: https://www.tensorflow.org/tutorials/estimator/linear And I have arrived at the end whereas I have created an evaluation set input function and a feature column and have predicted the probabilities of the evaluation set. However, one thing I don't understand is that, machine learning is supposed to train using the training set and then predict labels using features and thus extrapolate in a sense. However, I'm feeding it both labels and features through my evaluation function:

def make_input_fn(data_df, label_df, num_epochs=10, shuffle=True, batch_size=32):
  def input_function():
    ds = tf.data.Dataset.from_tensor_slices((dict(data_df), label_df))
    if shuffle:
      ds = ds.shuffle(1000)
    ds = ds.batch(batch_size).repeat(num_epochs) 
    return ds
  return input_function
train_input_fn = make_input_fn(dftrain, y_train)
dfeval = pd.read_csv("https://storage.googleapis.com/tf-datasets/titanic/eval.csv")
y_eval = dfeval.pop("survived")
eval_input_fn = make_input_fn(dfeval, y_eval, 1, False)
...
result = list(linear_est.predict(eval_input_fn))

How may I feed my model a single input node and get back a predicted output label? As in, instead of getting a probability back, I get an actual value back.

Whiteclaws
  • 902
  • 10
  • 31
  • Most of the classifiers used in practice (including Tensorflow's linear classifier here) actually produce probabilistic outputs; from then, getting "hard" class predictions (i.e. class `0/1` in the binary case) is the result of a simple thresholding operation. See [Predict classes or class probabilities?](https://stackoverflow.com/questions/51367755/predict-classes-or-class-probabilities) and [High AUC but bad predictions with imbalanced data](https://stackoverflow.com/questions/51190809/high-auc-but-bad-predictions-with-imbalanced-data) and the links therein for more. – desertnaut Jul 29 '20 at 01:04
  • @desertnaut what about numeric labels with infinite values say floats? How would find the correct output for a set of input features? Say if you wanted to predict the age at which people survive the most? – Whiteclaws Jul 29 '20 at 01:25
  • This is regression, and not classification. Different problem, different methods and algorithms, and nothing to do with probabilistic outputs, like here (we usually don't even call them *labels*) – desertnaut Jul 29 '20 at 01:27

0 Answers0