Could you please recommend the best way to visualize data with four variables using any of the available R packages?

Namely, I have two categorical variables (population, with 12 levels, and character, with 50 levels) and two continuous variables (the mean and the coefficient of variation of each character's length measurements for 100 individuals, stored as rows in a matrix). So it is basically a 12x50x100x100 dimensional graph.

Any suggestions?

Richie Cotton
Fedja Blagojevic
  • What do you want to show/display? – chl Feb 22 '12 at 13:05
  • Cloudplot ( http://stackoverflow.com/q/6774777/636656 ) gets you three dimensions. You just need to subset over the fourth, so 12 cloudplots, although the results would not be easy to interpret. – Ari B. Friedman Feb 22 '12 at 13:45
  • It sounds to me that you don't really have the data that you think you have. If all you have about each of the 12 pops and 50 chars is the mean and SD, then they are not really "continuous data", but rather summaries of that data. – IRTFM Feb 22 '12 at 14:14
  • Yes, they are just summary statistics, but I want the display to be as compact as possible (certainly not a table) because it should be part of a paper that already has too many graphs and especially tables. – Fedja Blagojevic Feb 23 '12 at 09:00
  • @gsk3 Cloudplots are great, but I would like to pack the results into a 2D plane rather than a 3D model. – Fedja Blagojevic Feb 23 '12 at 09:17

2 Answers

I would plot the variables first one by one, then together, starting with the whole population and progressively slicing the data into the various groups.

# Sample data
n1 <- 6   # Was: 12
n2 <- 5   # Was: 50
n3 <- 10  # Was: 100
d1 <- data.frame(
  population = rep(LETTERS[1:n1], each=n2*n3),
  character = rep(1:n2, each=n3, times=n1),  # one block of characters per population
  id = 1:(n1*n2*n3),
  mean = rnorm(n1*n2*n3),
  var  = rchisq(n1*n2*n3, df=5)
)
# Not used, but often useful with ggplot2
library(reshape2)
d2 <- melt(d1, id.vars=c("population","character","id"))

# Look at the first variable
library(lattice)
densityplot( ~ mean, data=d1 )
densityplot( ~ mean, groups=population, data=d1 )
densityplot( ~ mean | population, groups=character, data=d1 )

# Look at the second variable
densityplot( ~ var, data=d1 )
densityplot( ~ var, groups=population, data=d1 )
densityplot( ~ var | population, groups=character, data=d1 )

# Look at both variables
xyplot( mean ~ var, data=d1 )
xyplot( mean ~ var, groups=population, data=d1 )
xyplot( mean ~ var | population, groups=character, data=d1 )

# The plots may be more readable with lines rather than points
xyplot( 
  mean ~ var | population, groups = character, 
  data = d1, 
  panel = panel.superpose, panel.groups = panel.loess
)
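
If, as the comments on the question suggest, there is only one mean and one coefficient of variation per population and character, a heat map is one compact 2D option. The following is a minimal sketch with made-up data; the data frame `summ` and its columns `mean` and `cv` are assumptions for illustration, not the asker's data.

```r
library(lattice)

set.seed(1)
# Hypothetical summary table: one row per population x character combination
summ <- expand.grid(
  character  = factor(1:50),
  population = LETTERS[1:12]
)
summ$mean <- rnorm(nrow(summ), mean = 10, sd = 2)    # made-up means
summ$cv   <- runif(nrow(summ), min = 0.05, max = 0.5) # made-up CVs

# One tile per population x character; colour encodes the value
p_mean <- levelplot(
  mean ~ character * population, data = summ,
  scales = list(x = list(rot = 90, cex = 0.5)),
  main = "Mean"
)
p_cv <- levelplot(
  cv ~ character * population, data = summ,
  scales = list(x = list(rot = 90, cex = 0.5)),
  main = "Coefficient of variation"
)

# Stack the two heat maps on a single page
print(p_mean, split = c(1, 1, 1, 2), more = TRUE)
print(p_cv,   split = c(1, 2, 1, 2))
```

Two stacked heat maps keep all 600 population and character combinations on one page, at the cost of reading values from a colour scale rather than a position.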
Vincent Zoonekynd

Consider lattice if you want to plot a series of "slices" along one dimension or another of your data. Why not pop on over to http://addictedtor.free.fr/graphiques/ and see if someone's written some code to create the kind of graph you want?
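
As a rough illustration of the "slices" idea, the sketch below draws one lattice panel per population, with each point representing one character. The data are simulated, and the data frame `s` with columns `mean` and `cv` is a hypothetical stand-in for the real summary statistics.

```r
library(lattice)

set.seed(42)
# Hypothetical summary data: one mean and one CV per population x character
s <- expand.grid(character = 1:50, population = LETTERS[1:12])
s$mean <- rnorm(nrow(s), mean = 10, sd = 2)
s$cv   <- runif(nrow(s), min = 0.05, max = 0.5)

# One panel ("slice") per population; each point is one character
p <- xyplot(
  cv ~ mean | population, data = s,
  layout = c(4, 3), as.table = TRUE,
  xlab = "Mean length", ylab = "Coefficient of variation"
)
print(p)
```

Swapping the roles of `population` and `character` in the formula would slice along the other categorical variable instead, although 50 panels is likely too many for one page.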

Carl Witthoft
  • Lattice is the key because it makes multiple plot panels easy. I think the idea of putting everything on one graph is starting to dim and will probably be rejected. – Fedja Blagojevic Feb 23 '12 at 09:18