I've been using KNN on a data set, with random_state = 42 set in the train_test_split step. Despite the fixed random state, the accuracy, classification report, predictions, etc. are different on each run. Why is that?
Here's the head of the data (I'm predicting position from order and all_time_runs):
   order position  all_time_runs
0     10   NO BAT           1304
1      2  CAN BAT           7396
2      3   NO BAT           6938
3      6  CAN BAT           4903
4      6  CAN BAT           3761
And here's the code for the classification and prediction:
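from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

#posdf is the DataFrame whose head is shown above
#splitting data into features and target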
X = posdf.drop('position',axis=1)
y = posdf['position']
knn = KNeighborsClassifier(n_neighbors = 5)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 42)
#fitting the KNN model
knn.fit(X_train, y_train)
#predicting with the model
prediction = knn.predict(X_test)
#knn score
score = knn.score(X_test, y_test)
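
For comparison, here is a minimal self-contained sketch of the same pipeline that I would expect to give identical results on every call. It uses synthetic data in place of posdf, and the helpers make_demo_frame and run_once are hypothetical names introduced only for this check. If this version is stable but the real script is not, the difference is probably coming from somewhere before the split (for example, how posdf is built or sampled) rather than from train_test_split or KNN.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def make_demo_frame(n_rows=200, seed=0):
    """Build a synthetic stand-in for posdf (made-up values, not the real data)."""
    rng = np.random.default_rng(seed)
    order = rng.integers(1, 12, size=n_rows)        # batting order 1..11
    runs = rng.integers(0, 10000, size=n_rows)      # career runs
    position = np.where(order <= 6, "CAN BAT", "NO BAT")
    return pd.DataFrame(
        {"order": order, "position": position, "all_time_runs": runs}
    )


def run_once(df, random_state=42):
    """Split, fit and score exactly as in the question, with a fixed seed."""
    X = df.drop("position", axis=1)
    y = df["position"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state
    )
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X_train, y_train)
    return knn.score(X_test, y_test), knn.predict(X_test)


demo_df = make_demo_frame()
score_a, pred_a = run_once(demo_df)
score_b, pred_b = run_once(demo_df)

# Both comparisons should hold when the input frame and the seed are unchanged.
print("same score:", score_a == score_b)
print("same predictions:", bool((pred_a == pred_b).all()))
```

Calling run_once(posdf) twice in the same session on the real data would be the equivalent check for my case.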