I am trying to apply the K-Means algorithm to my binary classification task, but I cannot work out how to draw a scatter plot of the resulting two clusters.
My dataset is simply in the following form:
# size, class
312, 1
319, 1
227, 0
The minimal example:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.cluster import KMeans
# Toy data: a single feature ("size") and a binary label ("class")
X = {'size': [312,319,227,301,273,311,277,291,303,381], 'class': [1,1,0,1,0,1,0,0,1,1]}
X = pd.DataFrame(data=X)
# Hold out 40% of the rows (4 of 10) as the test set
X_train, X_test, y_train, y_test = train_test_split(X['size'], X['class'], test_size=0.4)
# KMeans expects 2-D input, so turn each Series into a column vector
X_train = X_train.values.reshape(-1,1)
X_test = X_test.values.reshape(-1,1)
kmeans = KMeans(init="k-means++", n_clusters=2, n_init=10, max_iter=300, random_state=42)
kmeans.fit(X_train)
# Cluster index (0 or 1) assigned to each test sample
preds = kmeans.predict(X_test)
How can I draw a scatter plot of the samples in "X_test" that shows the two clusters, with each point colored (for 0 and 1) according to the predictions in "preds"?
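For reference, here is a minimal sketch of the kind of plot I have in mind, assuming matplotlib is available. Since there is only one feature, every point is placed on a constant y = 0 line; that placement, the colors, the `random_state=42` passed to `train_test_split`, and the output file name are my own assumptions for illustration and are not part of the original example.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split

data = pd.DataFrame({
    "size": [312, 319, 227, 301, 273, 311, 277, 291, 303, 381],
    "class": [1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
})

# Double brackets keep "size" two-dimensional, so no reshape is needed.
# random_state is an assumption added here to make the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    data[["size"]].values, data["class"].values, test_size=0.4, random_state=42
)

kmeans = KMeans(init="k-means++", n_clusters=2, n_init=10, max_iter=300, random_state=42)
kmeans.fit(X_train)
preds = kmeans.predict(X_test)

fig, ax = plt.subplots()

# One scatter call per cluster so that each cluster gets its own legend entry.
# The data is 1-D, so all points are drawn at y = 0 (a display choice only).
for cluster, color in zip((0, 1), ("tab:blue", "tab:orange")):
    mask = preds == cluster
    ax.scatter(X_test[mask, 0], np.zeros(mask.sum()), color=color,
               label=f"cluster {cluster}")

# Mark the fitted cluster centers on the same line.
centers = kmeans.cluster_centers_[:, 0]
ax.scatter(centers, np.zeros(len(centers)), color="black", marker="x", s=120,
           label="cluster centers")

ax.set_xlabel("size")
ax.set_yticks([])  # the y-axis carries no information here
ax.legend()
fig.savefig("kmeans_test_clusters.png")  # hypothetical file name
```

In an interactive session, `plt.show()` could replace the `savefig` call. Note that the cluster indices 0 and 1 returned by K-Means are arbitrary and need not line up with the values in "class".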