
I ran a clustering on my dataset, and it works so far.

Now I want to plot the clusters. With this code:

ggplot(mydata, aes(SalesRank, PageRank, colour= booksCluster$cluster))+ geom_point()

I get this result:

[image: the resulting scatter plot]

Now, instead of the cluster numbers, I want to show the values from the third column of my original dataset (e.g. 'XY').

How can I achieve this?

Edit:

Here is the structure of my data:

 $ SalesRank: int  18083 9284 15794 14630 -1 23395 12095 991 653 33717 ...
 $ PageRank : num  0.01 0.01241 0.00753 0.00454 0.00301 ...
 $ Verlag   : Factor w/ 58 levels "-1TION-Z","A-1conda",..: 40 33 33 33 33 57 33
moses

1 Answer


You will have to change the level names of booksCluster$cluster. If it is not a factor yet, coerce it to one first. You can then use levels(booksCluster$cluster) <- c(...), where ... is a vector of new names, one per level and in the same order as the existing levels.
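
A minimal sketch of the relabelling described above. It assumes the clustering came from kmeans(), so that $cluster is an integer vector. The toy data, the number of clusters, and the names "Group A" and "Group B" are made up for illustration and are not from the question. The factor is stored as a column of mydata so that ggplot can find it by name.

```r
# Toy stand-in for the asker's data; the values are invented for illustration.
set.seed(1)
mydata <- data.frame(
  SalesRank = c(18083, 9284, 15794, 14630, 23395, 12095),
  PageRank  = c(0.01, 0.01241, 0.00753, 0.00454, 0.00301, 0.002)
)

# Assumption: the clusters come from kmeans(); $cluster is then an integer vector.
booksCluster <- kmeans(scale(mydata), centers = 2)

# Coerce to a factor, then rename its levels (one name per level, in level order).
mydata$cluster <- factor(booksCluster$cluster)
levels(mydata$cluster) <- c("Group A", "Group B")  # hypothetical names

# Plot only if ggplot2 is installed; the legend now shows the new level names.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mydata, aes(SalesRank, PageRank, colour = cluster)) + geom_point()
  print(p)
}
```

Because the colour variable is now a factor, ggplot draws a discrete legend with one entry per level instead of a continuous colour bar.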

Roman Luštrik