Performing k-means cluster analysis: how can I reorganize the data into individual clusters?

Question

I am performing a k-means cluster analysis on a data frame with 62 variables (Tapping number 1-62) and 75000 columns. How can I organize the data frame into individual clusters?

I used fviz_cluster to visualize the clusters:

library(factoextra)  # provides fviz_cluster()
r_fit = kmeans(pressure_rotate, 5, nstart = 25)
fviz_cluster(r_fit, data = pressure_rotate)

I was able to get a table showing which variable belongs to which cluster with r_fit$cluster, but how can I reorganize the data so that I can see what each cluster contains? Something along the lines of:

cluster 1: Tapping number 3, Tapping number 5, Tapping number 12, ...
cluster 2: Tapping number 7, tapping number 9, ....
etc

What does "Tapping number" mean? I don't understand what you are looking for. — Phil, Feb 23 '23 at 15:22
It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. — MrFlick, Feb 23 '23 at 16:48

score 1 · Answer 1 · answered Feb 24 '23 at 04:26

It sounds like you have 62 rows (observations) and 75000 columns (variables), not 62 variables. Is that correct? It is also not clear whether "Tapping number" is a column in your data or just the row number. Here is an example using the iris data included in R:

library(factoextra)  # provides fviz_cluster()
data(iris)  # 150 rows, 4 numeric variables, one species variable
iris.km <- kmeans(iris[, -5], 3, nstart = 25)  # Exclude species variable
fviz_cluster(iris.km, iris[, -5])       # Make a plot showing the clusters
split(rownames(iris), iris.km$cluster)  # Show cluster membership by row name (cluster numbering can differ between runs)
# $`1`
#  [1] "51"  "52"  "54"  "55"  "56"  "57"  "58"  "59"  "60"  "61"  "62"  "63"  "64"  "65"  "66"  "67"  "68"  "69"  "70"  "71"  "72"  "73"  "74"  "75"  "76"  "77" 
# [27] "79"  "80"  "81"  "82"  "83"  "84"  "85"  "86"  "87"  "88"  "89"  "90"  "91"  "92"  "93"  "94"  "95"  "96"  "97"  "98"  "99"  "100" "102" "107" "114" "115"
# [53] "120" "122" "124" "127" "128" "134" "139" "143" "147" "150"
# 
# $`2`
#  [1] "53"  "78"  "101" "103" "104" "105" "106" "108" "109" "110" "111" "112" "113" "116" "117" "118" "119" "121" "123" "125" "126" "129" "130" "131" "132" "133"
# [27] "135" "136" "137" "138" "140" "141" "142" "144" "145" "146" "148" "149"
# 
# $`3`
#  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32"
# [33] "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44" "45" "46" "47" "48" "49" "50"
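
The same idea can be extended to pull out the rows themselves and to print one line per cluster in the layout you asked for. This is only a sketch on the iris data using base R; the names `km`, `rows_by_cluster`, `members`, and `km_cols` are made up for illustration. For your data, and assuming the tapping numbers are the row names, the equivalent call would be `split(rownames(pressure_rotate), r_fit$cluster)`.

```r
# Sketch on the built-in iris data; object names here are illustrative only.
set.seed(42)                                   # make the random starts repeatable
km <- kmeans(iris[, -5], centers = 3, nstart = 25)

# One data frame per cluster, holding the actual rows
rows_by_cluster <- split(iris[, -5], km$cluster)
sapply(rows_by_cluster, nrow)                  # cluster sizes

# One line per cluster, in the "cluster 1: ..." layout from the question
members <- split(rownames(iris), km$cluster)
for (k in names(members)) {
  cat("cluster ", k, ": ", paste(members[[k]], collapse = ", "), "\n", sep = "")
}

# kmeans() groups rows. If the items to group are columns instead,
# cluster the transpose. Here the 4 measurement columns go into 2 clusters.
km_cols <- kmeans(t(iris[, -5]), centers = 2, nstart = 25)
split(colnames(iris)[-5], km_cols$cluster)
```

If your tapping numbers turn out to be columns rather than rows, the transposed version at the end is probably the one you want, since `kmeans()` always assigns clusters to rows.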
