I have a small dataset. I clustered it using only the feature columns (the label column was not included), so I now have both the predicted cluster IDs and the true class labels. I need to evaluate the result with sklearn.metrics, but how do I know which true label corresponds to which predicted cluster ID? I have 7 class labels and got 7 clusters for them.

Here b5 might be cluster 1, or b2 might be cluster 2; I cannot tell, so I cannot match them. Please guide me.

from sklearn.cluster import KMeans  # import the library

def kmeans(df):
    # Use as many clusters as there are distinct true classes
    km = KMeans(n_clusters=df['class'].nunique())
    # Cluster IDs are arbitrary integers (0..k-1), unrelated to the class names
    df['prediction'] = km.fit_predict(df[['x', 'y']])
    return df

df = kmeans(df)
# Available outside the function without needing a global statement
y_predicted = df['prediction'].to_numpy()

I tried applying a label encoder to the class column, but it also produced different labels that did not match the cluster IDs. Thanks in advance.
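
One common way to do the matching described above is to build a contingency table of true classes against cluster IDs and pick the one-to-one assignment that maximizes agreement (the Hungarian algorithm, available as `scipy.optimize.linear_sum_assignment`). The following is only a minimal sketch: the helper name `map_clusters_to_labels` and the toy arrays are made up for illustration, and it assumes the number of clusters does not exceed the number of classes, as in the 7-versus-7 case here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import accuracy_score

def map_clusters_to_labels(y_true, y_pred):
    """Hypothetical helper: return {cluster_id: class_label} for the best one-to-one match."""
    classes, true_idx = np.unique(y_true, return_inverse=True)
    clusters, pred_idx = np.unique(y_pred, return_inverse=True)
    # Contingency table: rows are true classes, columns are cluster IDs
    table = np.zeros((len(classes), len(clusters)), dtype=int)
    np.add.at(table, (true_idx, pred_idx), 1)
    # linear_sum_assignment minimizes cost, so negate the counts to maximize overlap
    rows, cols = linear_sum_assignment(-table)
    return {clusters[c]: classes[r] for r, c in zip(rows, cols)}

# Toy data, invented for illustration (stand-ins for df['class'] and df['prediction'])
y_true = np.array(["b1", "b1", "b2", "b2", "b5", "b5"])
y_pred = np.array([2, 2, 0, 0, 1, 1])

mapping = map_clusters_to_labels(y_true, y_pred)
# Translate each cluster ID into its matched class label
y_mapped = np.array([mapping[c] for c in y_pred])
accuracy = accuracy_score(y_true, y_mapped)
print(mapping, accuracy)
```

With the real data, `df['class']` and `df['prediction']` would take the place of the toy arrays, and the mapped column can then be passed to any label-based function in sklearn.metrics. Note that the resulting score is optimistic by construction, because the mapping is chosen to maximize agreement.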

vimuth
  • Does this answer your question? [sklearn: calculating accuracy score of k-means on the test data set](https://stackoverflow.com/questions/37842165/sklearn-calculating-accuracy-score-of-k-means-on-the-test-data-set) – Marijn Jan 07 '23 at 19:28
  • Also related: https://stackoverflow.com/questions/51320227/determining-accuracy-for-k-means-clustering, https://stackoverflow.com/questions/37842165/sklearn-calculating-accuracy-score-of-k-means-on-the-test-data-set, https://stackoverflow.com/questions/59346797/what-is-the-accuracy-of-a-clustering-algorithm, https://stackoverflow.com/questions/64364851/how-to-evaluate-k-means-clustering-since-automatic-indexes-of-clusters-dont-mat – Marijn Jan 07 '23 at 19:31
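
A complementary option, not taken from the linked threads, is to use scores that do not depend on how the clusters are numbered, so no matching step is needed. The sketch below uses `adjusted_rand_score` and `normalized_mutual_info_score` from sklearn.metrics; the toy lists are invented for illustration.

```python
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

# Toy data, invented for illustration: same grouping, different numbering
y_true = ["b1", "b1", "b2", "b2", "b5", "b5"]
y_pred = [2, 2, 0, 0, 1, 1]
y_pred_renumbered = [0, 0, 1, 1, 2, 2]

# Both scores compare the groupings themselves, not the ID values
ari = adjusted_rand_score(y_true, y_pred)
nmi = normalized_mutual_info_score(y_true, y_pred)

# Renumbering the clusters leaves the score unchanged
ari_renumbered = adjusted_rand_score(y_true, y_pred_renumbered)
print(ari, nmi, ari_renumbered)
```

These scores measure how well the grouping agrees with the classes, which is not the same thing as accuracy, so they suit the case where only the quality of the partition matters.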
