I'm working through a simple clustering example and wondering whether there's a good way to make the colors in the two plots align.

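# Cluster on the two petal measurements
data <- data.frame(iris$Petal.Length, iris$Petal.Width)
iris.kmeans <- kmeans(data, 3)

# Side by side: colored by k-means cluster (left) and by species (right)
par(mfrow = c(1, 2))
plot(x = iris$Petal.Length, y = iris$Petal.Width, col = iris.kmeans$cluster)
plot(x = iris$Petal.Length, y = iris$Petal.Width, col = iris$Species)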

The output works, but gives me plots with color schemes that aren't aligned:

[image: side-by-side scatter plots of Petal.Length vs. Petal.Width]

Is there a good way to force these two plots to have the same color schemes?

AI52487963
  • You can probably set the factor levels of Species so that they "align" in the sense you're expecting here, though there's no reason to think that will extend beyond this example: `ord = c(1,3,2); with(iris, plot(Petal.Length, Petal.Width, col=factor(Species, levels=unique(Species)[ord])))` – Frank Mar 14 '17 at 18:21
  • This post might be relevant: http://stackoverflow.com/questions/31840378/how-to-avoid-recycling-of-colors-in-barplot-to-achieve-different-colors-within-e – Samuel Mar 14 '17 at 18:37
  • Note that (k-means) clustering is an undirected method, that is, an unsupervised learning technique. You tell it the number of clusters, but nothing else. Most importantly, you do not give it labels for the relevant output. In a true clustering problem, where no labels exist, there is therefore nothing to align the cluster colors with, and the question cannot be solved. If, on the other hand, you are doing classification on a training set, where labels are known ahead of time, this "alignment" should be possible. You could even extend it to later stages of the analysis. – lmo Mar 14 '17 at 19:01
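
For the labelled case described in the last comment, one possible way to line the colors up is to relabel each cluster with the species that dominates it. This is only a sketch, not a general solution: it assumes the true labels (`iris$Species`) are available, and the fixed seed, `nstart = 20`, and the names `petal`, `km`, `tab`, `cluster_to_species`, and `aligned` are my own choices rather than anything from the question. A majority vote is also not guaranteed to be one-to-one, so two clusters could end up with the same color on other data.

```r
# Sketch: give each k-means cluster the color of its majority species.
# Assumption: the true labels (iris$Species) are known.
set.seed(42)  # assumption: fixed seed so the run is reproducible

petal <- iris[, c("Petal.Length", "Petal.Width")]
km <- kmeans(petal, centers = 3, nstart = 20)

# Contingency table: rows are cluster ids, columns are species
tab <- table(km$cluster, iris$Species)

# For each cluster (row), the column index of its most frequent species
cluster_to_species <- apply(tab, 1, which.max)

# Translate every point's cluster id into the matching species index
aligned <- unname(cluster_to_species[km$cluster])

par(mfrow = c(1, 2))
plot(petal, col = aligned, main = "k-means (relabelled)")
plot(petal, col = as.integer(iris$Species), main = "Species")
```

Both plots then index the same palette with the same integer codes, so points where the clustering agrees with the species share a color, and the remaining mismatches are the points k-means assigned to a different group.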
