How to find cluster centroid with Scikit-learn

Question

I have a data set with (labeled) clusters. I'm trying to find the centroid of each cluster (the vector whose distance to all data points in the cluster is smallest).

I found many solutions that perform clustering and only then find the centroids, but I haven't yet found one for existing clusters.

Python scikit-learn is preferred. Thanks.

Do you have any code showing what you have and what you have tried? Generally, to find a cluster centroid you just take the average of the feature vectors of all examples in the cluster. Pandas-esque example: `df.groupby("cluster").mean()` – Ken Syme, May 14 '18 at 14:30
Check [this](http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html). One of the attributes of `KMeans` is `cluster_centers_` — ninesalt, May 14 '18 at 14:35
@KenSyme That is what I did at first, but my supervisor said he didn't want to do it this way. — sheldonzy, May 14 '18 at 14:36
Please show what you have tried and where you are facing difficulties. If you are unsure about where to start, SO is not the place. [Start here](http://scikit-learn.org/stable/modules/neighbors.html#) – Vivek Kumar, May 14 '18 at 14:36
@ninesalt I saw it, but my data is already labeled and I'm not looking to perform kmeans — sheldonzy, May 14 '18 at 14:36
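
The group-wise mean suggested in the comments above can be sketched as follows. This is only an illustration on hypothetical toy data: the column names `x`, `y`, and `cluster` are assumptions, not taken from the question.

```python
import pandas as pd

# Hypothetical toy data: two features plus an existing cluster label per row.
df = pd.DataFrame({
    "x": [-1, -2, -3, 1, 2, 3],
    "y": [-1, -1, -2, 1, 1, 2],
    "cluster": [1, 1, 1, 2, 2, 2],
})

# One row per label; each row is the mean feature vector of that cluster.
centroids = df.groupby("cluster").mean()
print(centroids)
```

The result is indexed by the cluster label, so `centroids.loc[1]` gives the centroid of cluster 1.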

score 9 · Accepted Answer · answered May 14 '18 at 14:35

Straight from the docs:

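# NearestCentroid is exported from sklearn.neighbors; the private
# sklearn.neighbors.nearest_centroid module path no longer works in recent releases.
from sklearn.neighbors import NearestCentroid
import numpy as np

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
y = np.array([1, 1, 1, 2, 2, 2])  # existing cluster labels
clf = NearestCentroid()
clf.fit(X, y)  # computes one centroid per label; no clustering is performed

print(clf.centroids_)
# [[-2.         -1.33333333]
#  [ 2.          1.33333333]]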

answered May 14 '18 at 14:35

sascha

FYI, under the hood this is just taking the mean (for Euclidean distance) or median (for Manhattan distance). – Ken Syme May 14 '18 at 14:41
You can then use `clf.classes_` to match the computed centroids to the original data classes. – Unconventional Wisdom May 11 '20 at 14:27
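
A minimal sketch of the two comments above, reusing the toy data from the answer: it pairs each centroid with its label via `clf.classes_` and compares the fitted centroids with a manual per-class mean (default Euclidean metric) and per-class median (`metric="manhattan"`). The variable names `centroid_by_label`, `manual_means`, and `manual_medians` are illustrative only.

```python
import numpy as np
from sklearn.neighbors import NearestCentroid

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
y = np.array([1, 1, 1, 2, 2, 2])

clf = NearestCentroid().fit(X, y)

# Row i of centroids_ corresponds to label classes_[i].
centroid_by_label = {label: c for label, c in zip(clf.classes_, clf.centroids_)}

# Default Euclidean metric: compare with the per-feature mean of each class.
manual_means = np.array([X[y == label].mean(axis=0) for label in clf.classes_])
print(np.allclose(clf.centroids_, manual_means))

# Manhattan metric: compare with the per-feature median of each class.
clf_l1 = NearestCentroid(metric="manhattan").fit(X, y)
manual_medians = np.array([np.median(X[y == label], axis=0) for label in clf_l1.classes_])
print(np.allclose(clf_l1.centroids_, manual_medians))
```

Looking centroids up through `classes_` avoids assuming that the labels are consecutive integers starting at zero.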
