How can I plot a dataset with 6 dimensions in a 2-dimensional plot?

I have a dataset with 6 attributes and more than 1000 rows which I am using for k-means clustering.

Now I want to visualise the data after I perform clustering. Could someone give me any hints on how to approach this? Thanks.

Rontogiannis Aristofanis
kaxil
  • Questions about what is the best type of plot aren't really programming questions. If you want recommendations for statistical visualizations of your data, [stats.se] might be a better place for your question. If you know what plot you want to make but don't know *how* to make it, then that would be a question for Stack Overflow (especially if you include a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)) – MrFlick Dec 17 '15 at 15:27
  • Alternatively, depending on your data, you could get funky in `ggplot2` and map one var to x, one to y, one to size, fill color, border color, shape, ... Or do it in 3D using `rgl`. – lukeA Dec 17 '15 at 17:19
  • Yes, I know that I need a 2-dimensional plot; my question is which command to use in R. – kaxil Dec 18 '15 at 19:59

2 Answers

pairs() might be useful.

Set up data (random and unstructured, because that is easier to generate; 1000 rows by 6 columns):

set.seed(101)
x <- matrix(rnorm(6000),ncol=6)
clust <- sample(1:5,size=1000,replace=TRUE)

Now plot (gap=FALSE is cosmetic; pch="." makes the plotting much faster for large data sets):

pairs(x,gap=FALSE,col=clust,pch=".")

This only shows you two-dimensional slices (i.e., you might miss higher-dimensional structure in your data this way), but it's better than nothing. If you really want to visualize higher-dimensional structure you might try something like rggobi ...
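
To tie this to the question's k-means use case, here is a minimal sketch that colours the scatterplot matrix by fitted cluster labels rather than by random ones. The simulated matrix is only a stand-in for the real data, and `centers = 3`, `nstart = 10`, and the column names `V1` to `V6` are arbitrary choices made for illustration.

```r
set.seed(101)

# Simulated stand-in for the real data: 1000 rows, 6 numeric attributes
x <- matrix(rnorm(6000), ncol = 6)
colnames(x) <- paste0("V", 1:6)

# k-means on standardized columns; centers = 3 is an arbitrary choice
km <- kmeans(scale(x), centers = 3, nstart = 10)

# Scatterplot matrix with one colour per fitted cluster
pairs(x, gap = 0, col = km$cluster, pch = ".")
```

With real data, `x` would be the numeric columns that were passed to `kmeans()`, and `km$cluster` the resulting label vector.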

Ben Bolker

The simplest thing would be to use PCA to reduce the dimensionality of your data to 2 or 3 dimensions. k-means assigns a cluster label to each row of your data, so you can easily plot the different groups on the reduced dataset. Here's a simple way to do PCA, though you could also use locally linear embedding (LLE) or other forms of dimensionality reduction.

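data(iris)
unique(iris$Species)
#[1] setosa     versicolor virginica
# princomp() has no center/scale arguments; cor=TRUE runs the PCA on the
# correlation matrix, i.e. on standardized variables
iris.pca <- princomp(iris[,c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")], cor=TRUE)

# plot the first two principal components, coloured by species
plot(iris.pca$scores[,1], iris.pca$scores[,2], col=iris$Species)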
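
For the situation in the question (6 attributes, clusters from k-means rather than known species), the same idea might look like the sketch below. It is an illustration under stated assumptions, not a definitive recipe: the data are simulated with three shifted groups, `centers = 3` is arbitrary, and `prcomp()` is used here in place of `princomp()`, so the scores live in `$x`.

```r
set.seed(42)

# Hypothetical stand-in for the real data: 1000 rows, 6 attributes,
# built from three groups with shifted means
grp <- rep(1:3, length.out = 1000)
dat <- matrix(rnorm(6000), ncol = 6) + 3 * grp
colnames(dat) <- paste0("V", 1:6)

# Standardize once so k-means and PCA see the same scaling
dat_scaled <- scale(dat)

# Cluster in the full 6-dimensional space (centers = 3 is an assumption)
km <- kmeans(dat_scaled, centers = 3, nstart = 10)

# Project onto principal components for display only
pca <- prcomp(dat_scaled)

# First two components, coloured by k-means cluster
plot(pca$x[, 1], pca$x[, 2], col = km$cluster, pch = 20,
     xlab = "PC1", ylab = "PC2")

# Overlay the cluster centres, projected into the same PCA space
ctr <- predict(pca, km$centers)
points(ctr[, 1], ctr[, 2], pch = 4, cex = 2, lwd = 2)
```

The clustering is done on all six dimensions and the PCA is used only to draw the result, so clusters that overlap in the plot may still be well separated in the original space.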
Gene Burinsky