I was going through the "Writing Our First Classifier - Machine Learning Recipes #5" machine learning video on YouTube. I followed along with the example, but I am not sure why I can't get the code to run.
Note: This isn't the final code for the KNN classifier. It is the initial testing phase.
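# Implementing a KNN classifier by hand instead of importing one from scikit-learn
import random

from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
# sklearn.cross_validation was removed in newer scikit-learn releases;
# train_test_split now lives in sklearn.model_selection.
from sklearn.model_selection import train_test_split


class ScrappyKNN():
    def fit(self, X_train, Y_train):
        # Just memorize the training data for now.
        self.X_train = X_train
        self.Y_train = Y_train

    def predict(self, X_test):
        predictions = []
        for row in X_test:
            # Placeholder: pick a random training label for each test row.
            label = random.choice(self.Y_train)
            predictions.append(label)
        # Return after the loop so there is one prediction per test row.
        return predictions


iris = load_iris()
X = iris.data
Y = iris.target

# Split the 150 samples into 75 for training and 75 for testing.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.5)

clf = ScrappyKNN()
clf.fit(X_train, Y_train)
predictions_result = clf.predict(X_test)

print(accuracy_score(Y_test, predictions_result))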
I am getting the error "ValueError: Found input variables with inconsistent numbers of samples: [75, 1]". I believe there is a size inconsistency between the lists, since the 150 samples are split into training and testing sets of 75 samples each (I used test_size=0.5). I am really stuck on this one. Could you tell me what this error means? I searched through similar answers on Stack Overflow, but I can't work out what causes this error. I'm new to using Python for machine learning. Can someone help me out?
Here is the full stack trace:
/Users/joyjitchatterjee/anaconda3/envs/machinelearning/bin/python /Users/joyjitchatterjee/PycharmProjects/untitled1/ml_5.py
Traceback (most recent call last):
  File "/Users/joyjitchatterjee/PycharmProjects/untitled1/ml_5.py", line 36, in <module>
    print(accuracy_score(Y_test,predictions_result))
  File "/Users/joyjitchatterjee/anaconda3/envs/machinelearning/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 176, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/Users/joyjitchatterjee/anaconda3/envs/machinelearning/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 71, in _check_targets
    check_consistent_length(y_true, y_pred)
  File "/Users/joyjitchatterjee/anaconda3/envs/machinelearning/lib/python3.6/site-packages/sklearn/utils/validation.py", line 173, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [75, 4]
Process finished with exit code 1
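To make the size mismatch concrete, here is a small sketch in plain Python (no scikit-learn) of the kind of length check that seems to be failing in the traceback above. The names `check_same_length`, `predict_all`, and `predict_early_return` and the toy data are made up for illustration. This is not scikit-learn's actual implementation, and the early-return variant is only one hypothetical way a 75-row test set could end up paired with a single prediction; it may not be what happens in my file.

```python
import random


def check_same_length(y_true, y_pred):
    """Illustrative helper: raise ValueError if the two sequences differ in length."""
    lengths = [len(y_true), len(y_pred)]
    if len(set(lengths)) > 1:
        raise ValueError(
            "Found input variables with inconsistent numbers of samples: %r" % lengths
        )


def predict_all(rows, labels):
    # The return sits after the loop, so one label is produced per row.
    predictions = []
    for _ in rows:
        predictions.append(random.choice(labels))
    return predictions


def predict_early_return(rows, labels):
    # Hypothetical variant: the return sits inside the loop,
    # so the function exits after the first row with a single label.
    predictions = []
    for _ in rows:
        predictions.append(random.choice(labels))
        return predictions
    return predictions


# Toy stand-ins for the 75 test rows (4 features each) and the 3 iris classes.
rows = [[0.0] * 4 for _ in range(75)]
labels = [0, 1, 2]
y_test = [random.choice(labels) for _ in rows]

# 75 true labels against 75 predictions: no error.
check_same_length(y_test, predict_all(rows, labels))

# 75 true labels against 1 prediction: the check raises ValueError.
try:
    check_same_length(y_test, predict_early_return(rows, labels))
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

Printing `len(Y_test)` and `len(predictions_result)` just before the `accuracy_score` call in the original script would show in the same way which of the two lists has the unexpected length.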