I have k-means clustered data with 15 clusters. I have found the 5 clusters with the highest density and put their indexes in a list. Now I want to remove these clusters from my clustered data and visualize the result. The visualized k-means result should have n_clusters = 15 - 5 = 10 clusters in the end. Here is my clustered K-Means object.
I used the code below to create it:
from sklearn.cluster import KMeans
kmeans2 = KMeans(n_clusters=15, random_state=10)
kmeans2.fit(X_train)
And here I found the clusters I should remove:
centroids = kmeans2.cluster_centers_
import numpy as np
from sklearn.neighbors import NearestNeighbors
# 6 is not a typo: each centroid is returned as its own nearest neighbor (distance 0), which leaves 5 real neighbors.
nn = NearestNeighbors(n_neighbors=6)
nn.fit(centroids)
neigh_dist, neigh_ind = nn.kneighbors(centroids, return_distance=True)
densities = [5/np.sum(neigh_dist[i,:]) for i in range(centroids.shape[0])]
clusters_to_remove=[]
densities2=densities.copy()
def Nmaxelements(list1, N):
    # Returns the N largest values of list1; note that it removes them from list1 in place.
    final_list = []
    for i in range(0, N):
        max1 = 0
        for j in range(len(list1)):
            if list1[j] > max1:
                max1 = list1[j]
        list1.remove(max1)
        final_list.append(max1)
    return final_list
final_list2=Nmaxelements(densities, 5)
clusters_to_remove=[]
for i in densities2:
    for j in final_list2:
        if i == j:
            clusters_to_remove.append(densities2.index(i))
print(clusters_to_remove)
The output is:
[2, 5, 12, 13, 14]
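As a side note, the indexes of the N largest densities could probably be found more compactly with `np.argsort` than with the nested loops above. This is only a sketch: the helper name `top_n_indices` and the example values are made up for illustration, and unlike `densities2.index(i)` it does not return the same index twice when two densities happen to be equal.

```python
import numpy as np

def top_n_indices(values, n):
    """Return the sorted indexes of the n largest entries in values."""
    values = np.asarray(values)
    # argsort is ascending, so the last n positions hold the n largest values
    return sorted(np.argsort(values)[-n:].tolist())

# Made-up densities, for illustration only
example_densities = [0.2, 0.9, 0.1, 0.7, 0.4]
print(top_n_indices(example_densities, 2))  # [1, 3]
```

With the variables from the post, this would be called as `top_n_indices(densities2, 5)`.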
How can I remove these clusters so that I can finally visualize my k-means object with 10 clusters?
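To make the goal concrete, here is a rough sketch of what I mean by "removing" clusters: drop every point assigned to one of the listed clusters, drop the matching centroids, and renumber the remaining labels from 0 to 9. The helper `drop_clusters` and the demo arrays are hypothetical and not part of scikit-learn; as far as I can tell, a fitted `KMeans` object has no method for deleting clusters, so this works on the labels and centroids instead.

```python
import numpy as np

def drop_clusters(X, labels, centers, clusters_to_remove):
    """Drop all points and centroids of the given clusters; relabel the rest 0..k-1."""
    X = np.asarray(X)
    labels = np.asarray(labels)
    centers = np.asarray(centers)

    # Boolean mask over points: True where the point's cluster is kept
    keep_points = ~np.isin(labels, clusters_to_remove)
    # Ids of the clusters that survive, in ascending order
    kept_clusters = np.setdiff1d(np.arange(centers.shape[0]), clusters_to_remove)

    # Map old cluster ids to consecutive new ids (removed clusters map to -1)
    remap = np.full(centers.shape[0], -1)
    remap[kept_clusters] = np.arange(kept_clusters.size)

    return X[keep_points], remap[labels[keep_points]], centers[kept_clusters]

# Tiny synthetic example (made-up data): 3 clusters, remove cluster 1
X_demo = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [9.0, 9.0]])
labels_demo = np.array([0, 0, 1, 2])
centers_demo = np.array([[0.05, 0.0], [5.0, 5.0], [9.0, 9.0]])
X_kept, labels_kept, centers_kept = drop_clusters(X_demo, labels_demo, centers_demo, [1])
print(labels_kept)  # [0 0 1]

# With the names from the post this would be roughly:
# X_kept, labels_kept, centers_kept = drop_clusters(
#     X_train, kmeans2.labels_, kmeans2.cluster_centers_, clusters_to_remove)
# plt.scatter(X_kept[:, 0], X_kept[:, 1], c=labels_kept)  # assumes 2-D features and matplotlib
```

Refitting a new `KMeans(n_clusters=10)` on the remaining points would also give 10 clusters, but those would generally not be the same as the 10 clusters left over from the original fit.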