
I am new to machine learning. This is a binary classification problem. I want to figure out how to deal with test data that does not include the target (output).

Normally, I would use sklearn:

from sklearn.model_selection import train_test_split

when the full data set (training + test) includes the target (output) value. But in my case two separate files are given. The training data file includes the target value; the test data file does not. How can I use an sklearn classification technique in this situation? I need to validate the model to check the accuracy of the classification. Feel free to use any toy example for the explanation.

G1124E
    This is always the case in a well-designed ML project. You split the labeled training data into a training set and a test set, train on the training set, evaluate on the held-out test set (which has targets), then use the trained model to predict on the no-target dataset and check that the results make sense – G. Anderson Aug 16 '19 at 20:28
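
The workflow described in this comment could look roughly like the sketch below. It is only a toy illustration under stated assumptions: the data is synthetic (generated with `make_classification`), the names `X_labeled`, `y_labeled` and `X_unlabeled` are hypothetical stand-ins for the two files, and `LogisticRegression` is an arbitrary choice of classifier, not something the question specifies.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the two files (assumption, not the asker's data):
# X_labeled / y_labeled play the role of the training file (with target),
# X_unlabeled plays the role of the test file (no target).
X_all, y_all = make_classification(n_samples=600, n_features=10, random_state=0)
X_labeled, y_labeled = X_all[:500], y_all[:500]
X_unlabeled = X_all[500:]

# Hold out 20% of the labeled data for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X_labeled, y_labeled, test_size=0.2, stratify=y_labeled, random_state=0
)

# Fit on the training part only.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Accuracy can only be measured where targets are known: the validation part.
val_accuracy = accuracy_score(y_val, clf.predict(X_val))
print(f"validation accuracy: {val_accuracy:.3f}")

# Predict on the file without targets; these predictions cannot be scored.
predictions = clf.predict(X_unlabeled)
print(predictions[:10])
```

The validation accuracy serves as the estimate of how well the model is likely to do on the no-target file, since that file itself offers nothing to compare the predictions against.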

1 Answer


You may find some answers here. Among these, I like this one, as it is more elegant.

Catalina Chircu