I am trying to fill NAs by cluster

Question

I have a data set, "colleges", that contains many NAs.

library(mlbench)
library(stats)

College <- read.csv("colleges.XL.csv", header = TRUE)
na.college <- na.omit(College)

row.names(na.college) <- NULL

# standardize every column except the first three
na.college[, -(1:3)] <- scale(as.matrix(na.college[, -(1:3)]))
hc <- hclust(dist(na.college[, -(1:3)]), method = "complete")
plot(hc, hang = -1)

a <- 11  # number of clusters

groups <- cutree(hc, a) # cut tree into "a" clusters
# draw dendrogram with red borders around the "a" clusters
rect.hclust(hc, a, border="red")

# your matrix dimensions have to match with the clustering results
# remove any columns from na.college, as you did for clustering
mat <- na.college


# select the rows belonging to each cluster
cluster_1 <- mat[which(groups==1),]
cluster_2 <- mat[which(groups==2),]
cluster_3 <- mat[which(groups==3),]
cluster_4 <- mat[which(groups==4),]
cluster_5 <- mat[which(groups==5),]
cluster_6 <- mat[which(groups==6),]
cluster_7 <- mat[which(groups==7),]
cluster_8 <- mat[which(groups==8),]
cluster_9 <- mat[which(groups==9),]
cluster_10 <- mat[which(groups==10),]
cluster_11 <- mat[which(groups==11),]

# append the column means of cluster 1 as an extra row
cluster_1 <- rbind(cluster_1[, -(1:3)], colMeans(cluster_1[, -(1:3)]))
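
The per-cluster subsetting can also be written with `split()`, which makes it harder to skip or mislabel a cluster. The following is only a minimal sketch: it assumes the built-in `USArrests` data as a stand-in for `colleges.XL.csv` (which is not available here), and it uses 3 clusters instead of 11 purely for illustration. The names `dat`, `clusters`, and `centroids` are made up for this example.

```r
# Stand-in data: USArrests is built in; the real colleges.XL.csv is not available
dat <- scale(USArrests)                       # standardize every column
hc <- hclust(dist(dat), method = "complete")
k <- 3                                        # illustrative; the question uses 11
groups <- cutree(hc, k)

# One data frame per cluster, named "1", "2", ...
clusters <- split(as.data.frame(dat), groups)

# Cluster centroids: one row per cluster, one column per variable
centroids <- t(sapply(clusters, colMeans))
```

`clusters[["1"]]` then plays the role of `cluster_1`, and `centroids` holds the column means of every cluster at once.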

From the standardized data, I made 11 clusters and a data set for each cluster. Now there is one observation in the original data, College, that has many NAs, although not all of its values are NA. Its column values, however, are not standardized.

I want to standardize its non-NA values so that I can figure out which of the 11 clusters it should belong to.
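
One possible way to approach this, shown only as a sketch under stated assumptions: keep the column means and standard deviations that `scale()` stores in its `"scaled:center"` and `"scaled:scale"` attributes, use them to standardize the incomplete observation, and compare it with each cluster centroid using only the columns that are not NA. The nearest-centroid rule is an assumption of this sketch, not something `hclust` provides. The built-in `USArrests` data stands in for `colleges.XL.csv`, 3 clusters are used instead of 11, and `new_obs` is a hypothetical observation.

```r
# Stand-in data: USArrests replaces the unavailable colleges.XL.csv
sc <- scale(USArrests)
center <- attr(sc, "scaled:center")   # column means used by scale()
spread <- attr(sc, "scaled:scale")    # column standard deviations used by scale()

hc <- hclust(dist(sc), method = "complete")
groups <- cutree(hc, 3)               # 3 clusters for illustration; the question uses 11

# Centroid of each cluster on the standardized scale (clusters x variables)
centroids <- apply(sc, 2, function(col) tapply(col, groups, mean))

# Hypothetical new observation with missing values, on the original scale
new_obs <- c(Murder = 13.2, Assault = NA, UrbanPop = 58, Rape = NA)

# Standardize with the training means and SDs; NAs stay NA
new_std <- (new_obs - center[names(new_obs)]) / spread[names(new_obs)]

# Euclidean distance to each centroid, using only the observed columns
obs <- !is.na(new_std)
d <- sqrt(rowSums(sweep(centroids[, obs, drop = FALSE], 2, new_std[obs])^2))
nearest <- as.integer(names(which.min(d)))

# Fill the NAs with the nearest centroid's values, then undo the scaling
filled_std <- ifelse(obs, new_std, centroids[as.character(nearest), ])
filled <- filled_std * spread + center
```

Note that assigning the result of `scale()` back into a data frame drops these attributes, so they need to be saved before that step.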

If you have any answers, please let me know.

When I run your code, I get this. `Error in file(file, "rt") : cannot open the connection In addition: Warning message: In file(file, "rt") : cannot open file 'colleges.XL.csv': No such file or directory`. Please provide enough context or a reproducible example so that we are able to help you. — Roman Luštrik, Nov 25 '13 at 14:10
see this [link](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on how to provide a reproducible example — TWL, Nov 25 '13 at 14:18
