4

How does one distinguish 4 different factors (not using size)? Is it possible to use hollow and solid points to distinguish a variable in ggplot2?

test=data.frame(x=runif(12,0,1),
     y=runif(12,0,1),
     siteloc=as.factor(c('a','b','a','b','a','b','a','b','a','b','a','b')),
     modeltype=as.factor(c('q','r','s','q','r','s','q','r','s','q','r','s')),
     mth=c('Mar','Apr','May','Mar','Apr','May','Mar','Apr','May','Mar','Apr','May'),
     yr=c(2010,2011,2010,2011,2010,2011,2010,2011,2010,2011,2010,2011))

Here x holds the observations and y holds the modeling results; I want to compare different model versions across several factors. Thanks!

nograpes
Dominik

3 Answers

5

I think it is very difficult to visually distinguish and compare x and y values across 4 factors. I would use faceting and reduce the number of factors, for example by combining two of them with interaction.

Here is an example using geom_bar:


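library(ggplot2)
library(reshape2)
# set.seed() only makes the data reproducible if it is called before `test` is built
set.seed(10)
# long format: one row per original row and variable, where variable is 'x' or 'y'
test.m <- melt(test,measure.vars=c('x','y'))
ggplot(test.m)+
  # interaction(yr,mth) merges two factors into a single x-axis variable
  geom_bar(aes(x=interaction(yr,mth),y=value,
                 fill=variable),stat='identity',position='dodge')+
  facet_grid(modeltype~siteloc)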
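
If the combined year-month axis should run chronologically instead of alphabetically, one possible sketch is to give mth explicit levels before building the interaction. The column name period is made up for this illustration, and month.abb is base R's built-in vector of English month abbreviations, which is assumed to match the labels in the data.

```r
library(ggplot2)
library(reshape2)

set.seed(10)
test <- data.frame(x = runif(12, 0, 1),
                   y = runif(12, 0, 1),
                   siteloc = as.factor(rep(c('a', 'b'), 6)),
                   modeltype = as.factor(rep(c('q', 'r', 's'), 4)),
                   mth = rep(c('Mar', 'Apr', 'May'), 4),
                   yr = rep(c(2010, 2011), 6))

test.m <- melt(test, measure.vars = c('x', 'y'))

# Order months by calendar position rather than alphabetically.
test.m$mth <- factor(test.m$mth, levels = month.abb)

# 'period' is a hypothetical helper column. lex.order = TRUE makes the year
# vary slowest, and drop = TRUE removes year-month pairs absent from the data.
test.m$period <- interaction(test.m$yr, test.m$mth,
                             drop = TRUE, lex.order = TRUE)

p <- ggplot(test.m) +
  geom_bar(aes(x = period, y = value, fill = variable),
           stat = 'identity', position = 'dodge') +
  facet_grid(modeltype ~ siteloc)
print(p)
```

ggplot2 draws a discrete axis in the order of the factor levels, so fixing the levels is enough; no extra scale is needed.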
agstudy
  • Thanks @agstudy. The more I mess with this the more I like these bargraphs (although I flipped facet_grid(site~model.type) because the intermodel comparison is the important part). One thing, is there a way to order x-axis chronologically instead of alphabetically? – Dominik Jul 29 '13 at 16:07
3

You can use hollow and solid points, but only with certain shapes (shapes 21 to 25, which have both a border colour and a fill), as described in this answer.

So, that leaves you with fill, colour, shape, and alpha as your aesthetic mappings. It looks ugly, but here it is:

ggplot(test, aes(x, y,
                 fill=modeltype,
                 shape=siteloc,
                 colour=mth,
                 alpha=factor(yr)
                 )) + 
geom_point(size = 4) + 
scale_shape_manual(values=21:25) +
scale_alpha_manual(values=c(0.35,1))

Ugly, but I guess it is what you asked for. (I haven't bothered to figure out what is happening with the legend -- it obviously isn't displaying the borders right.)

Ugly

If you want to map a variable to a kind of custom aesthetic (hollow and solid), you'll have to go a little further:

test$fill.type<-ifelse(test$yr==2010,'other',as.character(test$mth))
cols<-c('red','green','blue')

ggplot(test, aes(x, y,
                 shape=modeltype,
                 alpha=siteloc,
                 colour=mth,
                 fill=fill.type
)) + 
  geom_point(size = 10) + 
  scale_shape_manual(values=21:25) +
  scale_alpha_manual(values=c(1,0.5)) +
  scale_colour_manual(values=cols) +
  scale_fill_manual(values=c(cols,NA))

Still ugly

Still ugly, but it works. I don't know a cleaner way of mapping the fill to one colour if yr is 2010 and to the mth colour if not; I'd be happy if someone showed me a cleaner way to do that. The guides (legends) are now totally wrong, but you can fix that manually.
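
One possibly cleaner variant, offered as a sketch rather than a definitive answer: compute the fill colour per row and pass it through scale_fill_identity, so that one level of a two-level factor is drawn hollow (white fill) and the other is filled with its border colour. The column name fill.col and the choice of siteloc as the hollow/solid variable are assumptions for this illustration, and yr is left out here.

```r
library(ggplot2)

set.seed(10)
test <- data.frame(x = runif(12, 0, 1),
                   y = runif(12, 0, 1),
                   siteloc = as.factor(rep(c('a', 'b'), 6)),
                   modeltype = as.factor(rep(c('q', 'r', 's'), 4)),
                   mth = rep(c('Mar', 'Apr', 'May'), 4),
                   yr = rep(c(2010, 2011), 6))

# Named vector so each month always gets the same border colour.
cols <- c(Mar = 'green', Apr = 'red', May = 'blue')

# 'fill.col' is a hypothetical helper column: site 'a' is filled with the
# month colour, site 'b' gets a white fill and so reads as hollow.
test$fill.col <- ifelse(test$siteloc == 'a',
                        unname(cols[as.character(test$mth)]),
                        'white')

p <- ggplot(test, aes(x, y, shape = modeltype, colour = mth, fill = fill.col)) +
  geom_point(size = 4) +
  # 22 = square, 21 = circle, 23 = diamond; all three have a separate fill
  scale_shape_manual(values = c(q = 22, r = 21, s = 23)) +
  scale_colour_manual(values = cols) +
  # use the colour names stored in fill.col as they are
  scale_fill_identity()
print(p)
```

By default scale_fill_identity draws no legend, so the hollow/solid meaning would still have to be explained in a caption or a manually built guide.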

Community
nograpes
  • Except none of these are hollow. Hollow and filled would work well for either siteloc or year because it's dichotomous. Shape for modeltype. Ignoring year for now, I'd like red filled square to indicate apr, site a, modeltype q. red hollow square to indicate apr, site b, modeltype q. green filled circle to indicate mar, site a, modeltype r. green hollow circle: mar, b, r. green fill diamond: mar,a,s. etc. The fill color should match the border color (when filled) or i suppose be white when 'hollow'. Does that make sense? – Dominik Jul 28 '13 at 19:02
  • @Dominik I updated my answer to map the variables to a different aesthetic (hollow and solid). – nograpes Jul 30 '13 at 02:52
3

I really like agstudy's use of interaction and would probably try that first. But if the factors are kept unchanged:

4 factors could be accommodated with faceting and 2 axes. That leaves the 2 metrics x and y: one option is a bubble chart that distinguishes the metrics by color, by shape, or by both (jitter added so the shapes overlap less):

library(ggplot2)
library(reshape2)
testm = melt(test, id=c('siteloc', 'modeltype', 'mth', 'yr'))

# by color
ggplot(testm, aes(x=siteloc, y=modeltype, size=value, colour=variable)) +
  geom_point(shape=21, position="jitter") +
  facet_grid(mth~yr) +
  scale_size_area(max_size=40) +
  scale_shape(solid=FALSE) +
  theme_bw()

Metrics distinguished by color

#by shape
testm$shape = as.factor(with(testm, ifelse(variable=='x', 21, 25)))

ggplot(testm, aes(x=siteloc, y=modeltype, size=value, shape=shape)) +
  geom_point(position="jitter") +
  facet_grid(mth~yr) +
  scale_size_area(max_size=40) +
  scale_shape(solid=FALSE) +
  theme_bw() 


# by shape and color
ggplot(testm, aes(x=siteloc, y=modeltype, size=value, colour=variable, shape=shape)) +
  geom_point(position="jitter") +
  facet_grid(mth~yr) +
  scale_size_area(max_size=40) +
  scale_shape(solid=FALSE) +
  theme_bw()


UPDATE:

This is an attempt, based on Dominik's first comment, to show whether (x,y) lies above or below the 1:1 line and how large the ratio x/y or y/x is. A blue triangle means x/y>1, a red circle otherwise (melt is not needed in this case):

test$shape = as.factor(with(test, ifelse(x/y>1, 25, 21)))
test$ratio = with(test, ifelse(x/y>1, x/y, y/x))

ggplot(test, aes(x=siteloc, y=modeltype, size=ratio, colour=shape, shape=shape)) +
  geom_point() +
  facet_grid(mth~yr) +
  scale_size_area(max_size=40) +
  scale_shape(solid=FALSE) +
  theme_bw()


topchef
  • I think I like this better than using `interaction` but it's rather meaningless in this context. larger/smaller is not specifically of interest, but rather above or below 1:1 line. Is there an easy transformation/stat to make the circles more meaningful? – Dominik Jul 28 '13 at 19:22
  • in other words, the interest is not in (x,y) but in x/y ratio, in particular, how much it's > or < 1, correct? – topchef Jul 29 '13 at 03:19
  • yes. I think I have something that I like, I actually removed the shape because I find different sized circles easier to compare (with red=underprediction, blue=overprediction). Is there a page that lists the shape types, I didn't see it when perusing docs.ggplot2.org/current? Also, is there anything less abstract to visually compare? I find linear distance more intuitive comparing size. Or perhaps it'd be possible to increase the number of ratio circles in the legend? Thank you for all your help – Dominik Jul 29 '13 at 15:45
  • @Dominik this is [one of pages with shapes](http://www.cookbook-r.com/Graphs/Shapes_and_line_types/) – topchef Jul 29 '13 at 16:01