
I am running k-means clustering in R. Once the clusters have been formed, how can I access the data points that belong to a particular cluster?

Please provide some code. It will make it much easier to answer in a helpful way. – RHertel May 26 '16 at 06:33

Welcome to Stack Overflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – zx8754 May 26 '16 at 06:44

1 Answer


Here's a simple example: clustering the built-in iris data set with k-means, based on the values of Petal.Length and Petal.Width, using three centers:

k_cluster <- kmeans(iris[c("Petal.Length", "Petal.Width")], centers = 3)
#> k_cluster
#K-means clustering with 3 clusters of sizes 50, 54, 46
#
#Cluster means:
#  Petal.Length Petal.Width
#1     1.462000    0.246000
#2     4.292593    1.359259
#3     5.626087    2.047826

#Clustering vector:
#  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2
# [56] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 2 3 3 3
#[111] 3 3 3 3 3 3 3 3 3 2 3 3 3 2 3 3 2 2 3 3 3 3 3 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3

#Within cluster sum of squares by cluster:
#[1]  2.02200 14.22741 15.16348
# (between_SS / total_SS =  94.3 %)
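The pieces of this printout can also be read directly from the fitted object. The following is a minimal sketch, not part of the original answer; the seed value 42 is an arbitrary assumption added only so that repeated runs give the same result.

```r
# Arbitrary seed (an assumption for reproducibility, not from the answer)
set.seed(42)
k_cluster <- kmeans(iris[c("Petal.Length", "Petal.Width")], centers = 3)

# List the components stored in the kmeans result
names(k_cluster)

# Number of rows assigned to each cluster ("sizes" in the printout)
k_cluster$size

# Cluster centers ("Cluster means" in the printout), one row per cluster
k_cluster$centers

# Within-cluster sum of squares, one value per cluster
k_cluster$withinss
```

Cluster numbers are arbitrary labels, so the order of the rows in `k_cluster$centers` can differ from the printout above.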

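The assignment of the entries (rows of the iris set) to one of the three clusters is stored in the clustering vector k_cluster$cluster. Note that kmeans() starts from randomly chosen centers, so the cluster numbers, and sometimes the assignments themselves, can change between runs unless the seed is fixed with set.seed(). The sample output below appears to come from a separate run, so its rows do not match the clustering vector printed above. To access the entries that belong to, say, cluster number 3, one could use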

iris[k_cluster$cluster==3,]
#> head(iris[k_cluster$cluster==3,])
#   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#51          7.0         3.2          4.7         1.4 versicolor
#52          6.4         3.2          4.5         1.5 versicolor
#54          5.5         2.3          4.0         1.3 versicolor
#55          6.5         2.8          4.6         1.5 versicolor
#56          5.7         2.8          4.5         1.3 versicolor
#57          6.3         3.3          4.7         1.6 versicolor
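Two other common ways to get at the members of each cluster are sketched below, under the same assumptions as the example above. The names `iris_clustered`, `cluster_3`, and `by_cluster` are illustrative, and the seed is arbitrary.

```r
# Arbitrary seed (an assumption for reproducibility, not from the answer)
set.seed(42)
k_cluster <- kmeans(iris[c("Petal.Length", "Petal.Width")], centers = 3)

# Option 1: store the labels as a new column, then filter on it
iris_clustered <- iris
iris_clustered$cluster <- k_cluster$cluster
cluster_3 <- subset(iris_clustered, cluster == 3)

# Option 2: split the data into a list with one data frame per cluster
by_cluster <- split(iris, k_cluster$cluster)
head(by_cluster[["3"]])  # list elements are named "1", "2", "3"

# Cross-tabulate cluster labels against the known species
table(k_cluster$cluster, iris$Species)
```

The `split()` form is convenient when every cluster needs the same treatment, for example `lapply(by_cluster, summary)`.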

There are also several ways to visualize the clusters. However, given how general the question currently is, it does not seem appropriate to go into further detail here.
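For a rough idea only, one of the simplest options is a base-R scatter plot colored by cluster label. This is a sketch rather than a recommendation; the variable name `petals`, the plot title, and the seed are assumptions made for illustration.

```r
# Arbitrary seed (an assumption for reproducibility, not from the answer)
set.seed(42)
petals <- iris[c("Petal.Length", "Petal.Width")]
k_cluster <- kmeans(petals, centers = 3)

# Scatter plot of the two clustering variables, one color per cluster label
plot(petals, col = k_cluster$cluster, pch = 19,
     main = "k-means clusters on iris petal measurements")

# Mark the cluster centers with crosses
points(k_cluster$centers, pch = 4, cex = 2, lwd = 2)
```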

Hope this helps.

RHertel