I was going through the "Writing Our First Classifier - Machine Learning Recipes #5" machine learning video on YouTube. I followed along with the example, but I can't work out why my code doesn't run.

Note: This isn't the final code for the KNN Classifier. It is the initial testing phase.

# Implementing a KNN classifier by hand instead of importing one from sklearn
import random

class ScrappyKNN():
    def fit(self, X_train, Y_train):
        self.X_train=X_train
        self.Y_train=Y_train

    def predict(self, X_test):
        predictions=[]
        for row in X_test:
            label = random.choice(self.Y_train)
            predictions.append(label)

            return predictions

from sklearn.datasets import load_iris
iris=load_iris()
X=iris.data 
Y=iris.target 

from sklearn.cross_validation import train_test_split

X_train,X_test,Y_train,Y_test=train_test_split(X,Y,test_size=0.5)

clf=ScrappyKNN()
clf.fit (X_train,Y_train)
predictions_result=clf.predict(X_test)

from sklearn.metrics import accuracy_score

print(accuracy_score(Y_test,predictions_result))

I am getting the error "ValueError: Found input variables with inconsistent numbers of samples: [75, 1]". I believe there is a size mismatch somewhere, since the 150 samples are split into training and testing sets of 75 samples each (I used test_size=0.5). I am really stuck on this one. Could you tell me what this error means? I searched through similar answers on Stack Overflow but couldn't work out what causes it. I'm new to using Python for machine learning, so any help would be appreciated.

Here is the full stack trace:

/Users/joyjitchatterjee/anaconda3/envs/machinelearning/bin/python /Users/joyjitchatterjee/PycharmProjects/untitled1/ml_5.py
Traceback (most recent call last):
  File "/Users/joyjitchatterjee/PycharmProjects/untitled1/ml_5.py", line 36, in <module>
    print(accuracy_score(Y_test,predictions_result))
  File "/Users/joyjitchatterjee/anaconda3/envs/machinelearning/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 176, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/Users/joyjitchatterjee/anaconda3/envs/machinelearning/lib/python3.6/site-packages/sklearn/metrics/classification.py", line 71, in _check_targets
    check_consistent_length(y_true, y_pred)
  File "/Users/joyjitchatterjee/anaconda3/envs/machinelearning/lib/python3.6/site-packages/sklearn/utils/validation.py", line 173, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [75, 4]

Process finished with exit code 1

JChat
  • Maybe [this thread](https://stackoverflow.com/questions/30813044/sklearn-found-arrays-with-inconsistent-numbers-of-samples-when-calling-linearre) will help! – Ketan Mukadam Jul 08 '18 at 05:54
  • Your class returns `random.choice` without any training. You should clarify this, as someone may take this code to be the final version. – rnso Jul 08 '18 at 07:33

1 Answer

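The final return statement in your code is indented incorrectly. Because it sits inside the for loop, predict returns after the first iteration with a single prediction, which is why the error reports 1 sample against the 75 in Y_test. It should be: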

def predict(self, X_test):
    predictions=[]
    for row in X_test:
        label = random.choice(self.Y_train)
        predictions.append(label)

    return predictions
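
To see the effect of the indentation in isolation, here is a minimal sketch that compares the two versions without sklearn. The class names `BuggyKNN` and `FixedKNN` and the stand-in lists are hypothetical and only mimic the shape of the 75/75 iris split; they are not part of the original code.

```python
import random


class BuggyKNN:
    """Reproduces the question's bug: `return` sits inside the loop."""

    def fit(self, X_train, Y_train):
        self.X_train = X_train
        self.Y_train = Y_train

    def predict(self, X_test):
        predictions = []
        for row in X_test:
            predictions.append(random.choice(self.Y_train))
            return predictions  # exits after the first row
        return predictions  # only reached when X_test is empty


class FixedKNN(BuggyKNN):
    """Same placeholder classifier with `return` after the loop."""

    def predict(self, X_test):
        predictions = []
        for row in X_test:
            predictions.append(random.choice(self.Y_train))
        return predictions  # runs once, after every row is handled


# Hypothetical stand-in data: 75 rows of 4 features, labels 0..2.
X_train = [[0.0, 0.0, 0.0, 0.0] for _ in range(75)]
Y_train = [i % 3 for i in range(75)]
X_test = [[1.0, 1.0, 1.0, 1.0] for _ in range(75)]

buggy, fixed = BuggyKNN(), FixedKNN()
buggy.fit(X_train, Y_train)
fixed.fit(X_train, Y_train)

buggy_len = len(buggy.predict(X_test))
fixed_len = len(fixed.predict(X_test))
print(buggy_len, fixed_len)  # prints "1 75"
```

Comparing `len(predictions_result)` with `len(Y_test)` before calling `accuracy_score` is a quick way to catch this kind of mismatch, since that function requires both inputs to have the same number of samples.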
Gambit1614
  • Now, the error has changed to "ValueError: Found input variables with inconsistent numbers of samples: [75, 4]". – JChat Jul 08 '18 at 06:47
  • @JoyjitChatterjee I ran your code with my updated predict function and it works correctly. Which Python and sklearn versions are you using? – Gambit1614 Jul 08 '18 at 07:03
  • I'm using Python 3.6.5 with sklearn version 0.19.0. I'm using Anaconda and PyCharm. – JChat Jul 08 '18 at 07:07
  • Can you post a screenshot of your code? I think you are missing some indentation. See https://repl.it/@gambit16/PeacefulSlushyCrash – Gambit1614 Jul 08 '18 at 07:13
  • Here is the screenshot: https://i.stack.imgur.com/ZCtM7.png I also checked your link and it's working perfectly. Thanks. I believe there is an indentation problem that I am not noticing. I copied your indented code into my editor and it worked perfectly. Can you tell me where my indentation was wrong? :) – JChat Jul 08 '18 at 07:21