
I was running KNN on my dataset, which required imputing the missing values and then transforming the variables so that they lie between 0 and 1.

I have to use the predicted results as inferred performance and build a TTD Model from them.

When I use predict I can get the predicted values, but I am unable to transfer these results into the base dataset so that they can be used to infer the performance.

Please find the sample code below:

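import numpy
import pandas
from sklearn import preprocessing
from sklearn.neighbors import KNeighborsClassifier
# Imputer is the pre-0.22 scikit-learn API; later releases provide sklearn.impute.SimpleImputer instead
from sklearn.preprocessing import Imputer

train = pandas.read_csv("dev_in.csv")
y_train = train['Y']
w_train = train['WT']
# treat -1 as a missing value
x_train1 = train[['ABC', 'GEF', 'XYZ']].replace(-1, numpy.nan)
values = x_train1.values
imputer = Imputer()
# replace missing values with the column mean
x_train_trf = imputer.fit_transform(values)
# count the NaN values remaining after imputation (total across all columns)
print(numpy.isnan(x_train_trf).sum())
# scale each row to unit L2 norm
X_normalized = preprocessing.normalize(x_train_trf, norm='l2')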

# similar data manipulations on the test population
test = pandas.read_csv("oot_in.csv")
y_test = test['Y']
w_test = test['WT']
x_test1 = test[['ABC', 'GEF', 'XYZ']].replace(-1, numpy.nan)
# count the NaN values per column before imputation
print(numpy.isnan(x_test1).sum())
values_test = x_test1.values
imputer = Imputer()
# replace missing values with the column mean
x_test_trf = imputer.fit_transform(values_test)
# count the NaN values remaining after imputation (total across all columns)
print(numpy.isnan(x_test_trf).sum())
X_normalized_test = preprocessing.normalize(x_test_trf, norm='l2')

#fitting the KNN
knn = KNeighborsClassifier(n_neighbors=5, weights= 'distance', p=2)
knn.fit(X_normalized, y_train)

#checking prediction on the test population
y_pred_test = knn.predict(X_normalized_test) 
test['inferred'] = y_pred_test
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-82-defc045e7eeb> in <module>()
----> 1 test ['inferred] = y_pred_test

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

I get the above error on the line where I try to create the variable inferred in the test dataset.

Your help will be greatly appreciated.
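
For reference, here is a minimal sketch of the workflow described above, ending with the predictions written back into the test DataFrame. It is not the code from the question: the two CSV files are replaced by small made-up DataFrames, `SimpleImputer` stands in for the older `Imputer`, and `MinMaxScaler` is swapped in for `preprocessing.normalize` because it maps each training column to the 0 to 1 range. The helper `to_features` and the column name `inferred_prob` are hypothetical names introduced only for this illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

FEATURES = ["ABC", "GEF", "XYZ"]

# Made-up stand-ins for dev_in.csv and oot_in.csv; -1 marks a missing value.
train = pd.DataFrame({
    "ABC": [1.0, 2.0, -1.0, 4.0, 5.0, 6.0],
    "GEF": [10.0, -1.0, 30.0, 40.0, 50.0, 60.0],
    "XYZ": [0.1, 0.2, 0.3, -1.0, 0.5, 0.6],
    "Y": [0, 0, 0, 1, 1, 1],
})
test = pd.DataFrame({
    "ABC": [1.5, -1.0, 5.5],
    "GEF": [15.0, 35.0, -1.0],
    "XYZ": [0.15, 0.35, 0.55],
    "Y": [0, 0, 1],
})


def to_features(frame):
    """Hypothetical helper: select the feature columns and turn -1 into NaN."""
    return frame[FEATURES].replace(-1, np.nan).to_numpy()


# Fit the imputer and the scaler on the training data only,
# then apply the same fitted transformations to the test data.
imputer = SimpleImputer(strategy="mean")
scaler = MinMaxScaler()
X_train = scaler.fit_transform(imputer.fit_transform(to_features(train)))
X_test = scaler.transform(imputer.transform(to_features(test)))

# n_neighbors is reduced to 3 only because the toy training set is tiny.
knn = KNeighborsClassifier(n_neighbors=3, weights="distance", p=2)
knn.fit(X_train, train["Y"])

# `test` is still a DataFrame here, so a string key creates a new column.
test["inferred"] = knn.predict(X_test)
# predict_proba returns one column per class; column 1 belongs to class 1.
test["inferred_prob"] = knn.predict_proba(X_test)[:, 1]
print(test)
```

The assignment works in this sketch because `test` is never overwritten: the transformed features live in separate arrays (`X_train`, `X_test`), and the predictions come back in the same row order as `test`.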

Beginner
  • Welcome to SO! Please share some minimal example code that represents your problem. Otherwise it is impossible to help you. – jhoepken Aug 13 '18 at 11:32
  • Will share in some time. Thank you. – Beginner Aug 13 '18 at 12:18
  • @JensHöpken I have updated the question with the code; it would be great if you could help me with it. Thank you. – Beginner Aug 13 '18 at 15:22
  • No, that was a mistake I made while cleaning up the code to turn it into a sample for you. – Beginner Aug 13 '18 at 15:25
  • it may be because you are working with floats and not ints – Liam Aug 13 '18 at 15:30
  • I have actually used the same code before: `y_pred_test = knn.predict(X_normalized_test)` followed by `test['inferred'] = y_pred_test`. I feel that because I performed transformations on the variables of the dataset and used the transformed variables to build the KNN, I am not able to get the inferred performance into the test dataset. – Beginner Aug 13 '18 at 15:33
  • what type of output does knn.predict(X_normalized_test) give? – Liam Aug 13 '18 at 15:36
  • It gives me 1 or 0, indicating whether the observation in question is good or bad, as predicted by the KNN model developed on a different dataset. – Beginner Aug 13 '18 at 15:38
  • is it an integer or float? – Liam Aug 13 '18 at 15:39
  • @Engineero which typo are you referring to? I can update that and get back to you. – Beginner Aug 13 '18 at 15:39
  • @Liam It is an integer – Beginner Aug 13 '18 at 15:40
  • Your error message still says `test['inferred] = y_pred_test`, which is definitely an `IndexError`, maybe a syntax error, since there is no closing bracket and the string inside the brackets makes no sense. This might be a holdover from when you had that same typo in the code you were showing, so if you re-run the code that you fixed, I would imagine you could get a different error message. At the very least the typo in the error message should be fixed. – Engineero Aug 13 '18 at 15:43
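
One way to follow up on the type questions in the comments is to check what kind of object `test` is at the failing line. The sketch below rests on an assumption that nothing in this thread confirms: that `test` had been rebound to a plain NumPy array somewhere earlier in the session. NumPy rejects string keys on a regular array with an `IndexError`, while the same assignment on a pandas DataFrame simply creates a column. The data and the name `test_as_array` are made up for the illustration.

```python
import numpy as np
import pandas as pd

y_pred_test = np.array([0, 1, 1])  # made-up predictions

# Case 1: `test` is a DataFrame, so a string key creates a new column.
test = pd.DataFrame({"ABC": [1.0, 2.0, 3.0]})
test["inferred"] = y_pred_test
print(type(test).__name__, list(test.columns))

# Case 2: the same data held as a plain NumPy array (hypothetical scenario).
# String keys are not valid indices for a regular ndarray.
test_as_array = test[["ABC"]].to_numpy().copy()
try:
    test_as_array["inferred"] = y_pred_test
    error_name = None
except IndexError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)
```

Printing `type(test)` just before the assignment in the original session would show which of the two cases applies.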

0 Answers