I am trying to recreate a figure from a ggplot2 seminar: http://dl.dropbox.com/u/42707925/ggplot2/ggplot2slides.pdf

In this case, I am trying to generate Example 5, with jittered data points subject to a dodge. When I run the code, the points are centered around the correct line, but have no jitter.

Here is the code directly from the presentation.

library(ggplot2)
set.seed(12345)
hillest<-c(rep(1.1,100*4*3)+rnorm(100*4*3,sd=0.2),
       rep(1.9,100*4*3)+rnorm(100*4*3,sd=0.2))
rep<-rep(1:100,4*3*2)
process<-rep(rep(c("Process 1","Process 2","Process 3","Process 4"),each=100),3*2)
memorypar<-rep(rep(c("0.1","0.2","0.3"),each=4*100),2)
tailindex<-rep(c("1.1","1.9"),each=3*4*100)
ex5<-data.frame(hillest=hillest,rep=rep,process=process,memorypar=memorypar, tailindex=tailindex)
stat_sum_df <- function(fun, geom="crossbar", ...) {stat_summary(fun.data=fun, geom=geom, ...) }

dodge <- position_dodge(width=0.9) 
p<- ggplot(ex5,aes(x=tailindex ,y=hillest,color=memorypar)) 
p<- p + facet_wrap(~process,nrow=2) + geom_jitter(position=dodge) +geom_boxplot(position=dodge)  
p
user1381239
  • Given that Didzis Elferts has provided a better answer using `position_jitterdodge`, available in ggplot2 version 1.0.0, you should un-accept my answer and accept the answer provided by Didzis Elferts. – Sandy Muspratt Jun 04 '14 at 00:26

2 Answers

ggplot2 version 1.0.0 added a new position, position_jitterdodge(), made for exactly this situation. Use it inside geom_point(), and map fill= inside aes() to indicate which variable the points should be dodged by. The dodge.width= argument controls the width of the dodging.

ggplot(ex5, aes(x=tailindex, y=hillest, color=memorypar, fill=memorypar)) +
      facet_wrap(~process, nrow=2) +
      geom_point(position=position_jitterdodge(dodge.width=0.9)) +
      geom_boxplot(fill="white", outlier.colour=NA, position=position_dodge(width=0.9))
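
As a hedged variation on the code above, position_jitterdodge() also accepts a jitter.width argument for the amount of horizontal jitter. The sketch below uses a small made-up data frame (`toy`, a hypothetical stand-in for ex5) so that it runs on its own; the values 0.2 and 0.5 are illustrative choices, not taken from the original answer.

```r
library(ggplot2)

# Toy data (hypothetical): two tailindex levels x three memorypar levels
set.seed(1)
toy <- data.frame(
  tailindex = rep(c("1.1", "1.9"), each = 60),
  memorypar = rep(rep(c("0.1", "0.2", "0.3"), each = 20), 2),
  hillest   = rnorm(120, mean = rep(c(1.1, 1.9), each = 60), sd = 0.2)
)

p <- ggplot(toy, aes(x = tailindex, y = hillest,
                     color = memorypar, fill = memorypar)) +
  # Boxes are dodged by the same width as the points so the layers line up
  geom_boxplot(fill = "white", outlier.colour = NA,
               position = position_dodge(width = 0.9)) +
  # jitter.width sets the horizontal jitter, dodge.width the dodging
  geom_point(alpha = 0.5,
             position = position_jitterdodge(jitter.width = 0.2,
                                             dodge.width = 0.9))
p
```

Keeping dodge.width equal to the width passed to position_dodge() is what keeps each cloud of points centered on its own box.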

Didzis Elferts
  • Thank you for this post; it is very helpful. I am currently writing my own code, which is working well, but I somehow get black outliers. Can you guess what the potential causes could be? – jazzurro Nov 07 '14 at 09:13
  • Black is the default color for outliers. In this code, `outlier.colour=` is set to `NA` in `geom_boxplot()` so that they are not shown. – Didzis Elferts Nov 07 '14 at 09:17
  • Hi, I just found the solution. `outlier.colour` has to be spelled colour, not color. It seems that the American spelling is not favoured here. Thanks for your reply. :-) – jazzurro Nov 07 '14 at 09:18
  • @DidzisElferts, `outlier.colour = NA` does not work in an Rshiny/plotly display. Is there any workaround? – user5249203 Mar 20 '18 at 17:08
  • You can also try using the color 'transparent'. – alexo4 Jul 09 '18 at 17:04

EDIT: There is a better solution with ggplot2 version 1.0.0 using position_jitterdodge. See @Didzis Elferts' answer. Note that dodge.width controls the width of the dodging and jitter.width controls the width of the jittering.

I'm not sure how the code produced the graph in the pdf.

But does something like this get you close to what you're after?

I convert tailindex and memorypar to numeric and add them together; the result is the x coordinate for the geom_jitter layer. There's probably a more efficient way to do it. I'd also like to see how dodging both geom_boxplot and geom_jitter, with no jittering, could produce the graph in the pdf.

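library(ggplot2)
dodge <- position_dodge(width = 0.9)

# Manual x position for the points: the integer position of tailindex (1 or 2)
# plus an offset of -0.3, 0 or +0.3 for memorypar 0.1, 0.2 or 0.3.
# factor() is needed because data.frame() in R >= 4.0 keeps strings as character.
ex5$memorypar2 <- as.numeric(factor(ex5$tailindex)) +
  3 * (as.numeric(as.character(ex5$memorypar)) - 0.2)

p <- ggplot(ex5, aes(x = tailindex, y = hillest)) +
   scale_x_discrete() +  # set the discrete x scale before numeric x values are added
   geom_jitter(aes(colour = memorypar, x = memorypar2),
     position = position_jitter(width = .05), alpha = 0.5) +
   geom_boxplot(aes(colour = memorypar), outlier.colour = NA, position = dodge) +
   facet_wrap(~ process, nrow = 2)
p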
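
The factor 3 and the shift of 0.2 in memorypar2 are chosen so that the points land on the centres of the dodged boxes. As a rough check in base R (dodge_offsets is a hypothetical helper written for this illustration, not a ggplot2 function, and it assumes n equally wide groups sharing the total dodge width):

```r
# Centre offsets of n dodged groups that share a total dodge width.
# Hypothetical helper for illustration; not part of ggplot2.
dodge_offsets <- function(n, width = 0.9) {
  (seq_len(n) - (n + 1) / 2) * width / n
}

offsets <- dodge_offsets(3)               # three memorypar levels
manual  <- 3 * (c(0.1, 0.2, 0.3) - 0.2)   # the expression used for memorypar2

print(offsets)
print(isTRUE(all.equal(offsets, manual)))
```

If the number of memorypar levels or the dodge width changes, the hand-written offsets have to be recomputed, which is one reason the position_jitterdodge approach is easier to maintain.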

Sandy Muspratt
  • Thanks, I wish there was an elegant way within ggplot2, but this workaround gets everything done. Thanks again! – user1381239 May 09 '12 at 20:24