I hope someone can help me, as I'm fairly new to R and Stack Overflow.

I'm trying to create a set of bar plots in R that indicate the p-values for the differences between treated and untreated samples. I have found two other posts similar to mine (Indicating the statistically significant difference in bar graph USING R and Indicating the statistically significant difference in bar graph).

However, I was wondering whether there is a more 'automated' way of placing the labels and lines that indicate statistical significance than the manual approach used in this previous post: Indicating the statistically significant difference in bar graph USING R? Whilst doing this manually does make some pretty graphs, it is very time-consuming.

Many thanks!

Example data (sorry, not sure how to upload it so imported from .csv):

Time,Dose,Variable,n,Mean,SD,Median,Upper.SEM,Lower.SEM
1,0,P,3,20.1341,1.049791,20,0.5728394,0.5569923
1,1,P,3,22.79528,1.110182,21.64,1.4179833,1.334943
6,0,P,3,38.63702,1.042969,37.74,0.9499892,0.9271918
6,1,P,3,24.25966,1.156925,23.82,2.1300073,1.9580866
24,0,P,3,42.3231,1.073583,43.75,1.7710033,1.6998725
24,1,P,3,13.78995,1.170568,13.15,1.3126463,1.1985573
48,0,P,3,36.01035,1.208213,35.63,4.1551262,3.7252776
48,1,P,3,23.3236,1.4403,20.65,5.4688355,4.4300848



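library(ggplot2)
# ExData is the example data above, read into a data frame.
# ggplot() is used here because current ggplot2 versions no longer accept
# stat= and position= in qplot().
g <- ggplot(ExData, aes(x=factor(Time), y=Mean, fill=factor(Dose))) +
  geom_bar(stat="identity", position="dodge") +
  geom_errorbar(aes(ymax=Mean+Upper.SEM, ymin=Mean-Lower.SEM),
                position=position_dodge(0.9), width=0.5)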
g<-g+  xlab("Time (hrs)") 
g<-g+  ylab("Concentration (pmol/uL)") 
g<-g+ coord_cartesian(ylim=c(0, 50)) + scale_y_continuous(breaks=seq(0, 50, 5))
g<-g+ guides(fill=guide_legend(title="Dose (uM)"))
g<-g+ scale_fill_manual(values=c("red","blue"))
g<-g+ theme_bw()
g<-g+ theme(plot.title = element_text(face="bold", size=20))
g<-g+ theme(axis.title.x = element_text(face="bold", size=20))
g<-g+ theme(axis.title.y = element_text(face="bold", size=20))
g<-g+ theme(axis.text.x=element_text(face="bold",colour='black', size=20))
g<-g+ theme(axis.text.y=element_text(face="bold",colour='black', size=20))
g<-g+theme(axis.text=element_text(face="bold", size=20))
# Legend Title and label appearance
g<- g+theme(legend.title = element_text(colour="black", size=20, face="bold"))
g<- g + theme(legend.text = element_text(colour="black", size = 20, face = "bold"))
### Line for p-value 1uM vs 0uM at 1hr
g<-g+ annotate("text",x=1,y=27,label="p=0.1289")
g<- g+ annotate("segment", x = 0.8, xend = 0.8, y = 25, yend = 26,colour = "black")
g<- g+ annotate("segment", x = 1.2, xend = 1.2, y = 25, yend = 26,colour = "black")
g<- g+ annotate("segment", x = 0.8, xend = 1.2, y = 26, yend = 26, colour = "black")
### Line for p-value 1uM vs 0uM at 6hr
g<-g+ annotate("text",x=2,y=42,label="p=0.0063")
g<- g+ annotate("segment", x = 1.8, xend = 1.8, y = 40, yend = 41, colour = "black")
g<- g+ annotate("segment", x = 2.2, xend = 2.2, y = 40, yend = 41, colour = "black")
g<- g+ annotate("segment", x = 1.8, xend = 2.2, y = 41, yend = 41,colour = "black")
### Line for p-value 1uM vs 0uM at 24hr
g<-g+ annotate("text",x=3,y=47,label="p=0.0004")
g<- g+ annotate("segment", x = 2.8, xend = 2.8, y = 45, yend = 46,colour = "black")
g<- g+ annotate("segment", x = 3.2, xend = 3.2, y = 45, yend = 46, colour = "black")
g<- g+ annotate("segment", x = 2.8, xend = 3.2, y = 46, yend = 46,colour = "black")
### Line for p-value 1uM vs 0uM at 48hr
g<-g+ annotate("text",x=4,y=43,label="p=0.1670")
g<- g+ annotate("segment", x = 3.8, xend = 3.8, y = 41, yend = 42,colour = "black")
g<- g+ annotate("segment", x = 4.2, xend = 4.2, y = 41, yend = 42,colour = "black")
g<- g+ annotate("segment", x = 3.8, xend = 4.2, y = 42, yend = 42,colour = "black")
g

(sorry, SO won't let me upload a picture of my graph)

MarinaWM

1 Answer

For me, the crux of the question is finding a workable height for the line that spans (i.e. sits over) each pair of bars in the graph. All the other bits and pieces in the OP's code are polish. Below is a half-worked solution (it's nearly bedtime for me), but the basics are in there, plus additional comments and suggestions for improvement.

Read the data from the clipboard and then compute a line height for each time point from the available data: the larger of the two doses' Mean + Upper.SEM, rounded and raised by 5.

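## "clipboard" works on Windows; the example data in the question is
## comma-separated, so use sep="," (use sep="\t" if your copy is tab-separated)
dat <- read.table("clipboard", sep=",", header=TRUE)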
library(plyr)
dat <- ddply(dat, .(Time), 
             function(d.f) {
               A <- subset(d.f, Dose==0)
               B <- subset(d.f, Dose==1)
               m <- max(A$Mean+A$Upper.SEM, B$Mean+B$Upper.SEM)
               d.f$bar.h <- round(m) + 5
               return(d.f)
             })

Set up a basic plot (forgetting about the polish):

library(ggplot2)
p <- ggplot(data=dat, mapping=aes(x=factor(Time), y=Mean, fill=factor(Dose))) +
  geom_bar(stat='identity', position='dodge') + 
  geom_errorbar(aes(ymin=Mean-Lower.SEM, ymax=Mean+Upper.SEM), 
                position=position_dodge(0.9), width=0.5)

## Instead of using annotate (which is just as fine), use geom_segment because
## the data is directly in the existing data.frame
p + geom_segment(mapping=aes(x=0.8, xend=1.2, y=bar.h, yend=bar.h))

Don't forget to add aes() to your call to geom_segment()!

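Now when you run the code you will (in hindsight, obviously) see three lines above the first pair of bars, because I fixed the x-coordinates of the segment to the same values for all levels of Time. (There are four time points but only three lines because the heights for 6 and 48 hrs coincide at 45.)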

To improve this, the next step is to add another set of columns to the data frame that specify the x-coordinates of each segment, and to update the geom_segment() code accordingly. For simplicity's sake, presuming that you want to create multiple graphs with the same number of pairs of bars in each graph, it is easiest to do this manually.
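A minimal sketch of that next step, under a few assumptions: the data are inlined from the question so the example runs on its own; the helper data frame `brk` and its columns `pos`, `x`, `xend` and `label` are names invented here for illustration; the +/- 0.2 offsets copy the 0.8/1.2 values used in the question; and the p-value labels are simply the ones the OP typed in by hand.

```r
library(ggplot2)

# Example data from the question, inlined so the sketch is self-contained
dat <- read.csv(text = "Time,Dose,Variable,n,Mean,SD,Median,Upper.SEM,Lower.SEM
1,0,P,3,20.1341,1.049791,20,0.5728394,0.5569923
1,1,P,3,22.79528,1.110182,21.64,1.4179833,1.334943
6,0,P,3,38.63702,1.042969,37.74,0.9499892,0.9271918
6,1,P,3,24.25966,1.156925,23.82,2.1300073,1.9580866
24,0,P,3,42.3231,1.073583,43.75,1.7710033,1.6998725
24,1,P,3,13.78995,1.170568,13.15,1.3126463,1.1985573
48,0,P,3,36.01035,1.208213,35.63,4.1551262,3.7252776
48,1,P,3,23.3236,1.4403,20.65,5.4688355,4.4300848")

# One row per time point: the top of the tallest error bar in each pair
dat$top <- dat$Mean + dat$Upper.SEM
brk <- aggregate(top ~ Time, data = dat, FUN = max)

# Same height rule as above: round and leave a gap of 5
brk$bar.h <- round(brk$top) + 5

# factor(Time) puts the levels at x = 1, 2, 3, ... in sorted order,
# which is also the row order that aggregate() returns
brk$pos  <- seq_len(nrow(brk))
brk$x    <- brk$pos - 0.2
brk$xend <- brk$pos + 0.2

# p-values as given in the question (assumed to be in time order)
brk$label <- paste0("p=", c("0.1289", "0.0063", "0.0004", "0.1670"))

p <- ggplot(dat, aes(x = factor(Time), y = Mean, fill = factor(Dose))) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = Mean - Lower.SEM, ymax = Mean + Upper.SEM),
                position = position_dodge(0.9), width = 0.5) +
  # horizontal line plus two short vertical ticks per pair of bars
  geom_segment(data = brk, inherit.aes = FALSE,
               aes(x = x, xend = xend, y = bar.h, yend = bar.h)) +
  geom_segment(data = brk, inherit.aes = FALSE,
               aes(x = x, xend = x, y = bar.h - 1, yend = bar.h)) +
  geom_segment(data = brk, inherit.aes = FALSE,
               aes(x = xend, xend = xend, y = bar.h - 1, yend = bar.h)) +
  geom_text(data = brk, inherit.aes = FALSE,
            aes(x = pos, y = bar.h + 1.5, label = label))
p
```

Keeping the bracket positions in their own small data frame, with `inherit.aes = FALSE`, draws one segment per time point rather than one per row of `dat`. The same block can be reused for other graphs with the same layout by swapping in new data and labels.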

Extrapolating from this presumption, do be aware of capitalization on chance (the multiple-comparisons problem) when you repeatedly make pairwise comparisons in an uninformed way (but this is just a guess on my end; you didn't specify where the p-values came from).
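As a hedged illustration of that caution: if the four p-values in the question came from four separate, unadjusted two-sample tests (an assumption; the OP did not say), one common option is to adjust them with base R's `p.adjust()` before labelling the plot. The Holm method is used here only as an example choice.

```r
# p-values as typed into the question's annotate() calls, in time order
p.raw <- c("1" = 0.1289, "6" = 0.0063, "24" = 0.0004, "48" = 0.1670)

# Holm's step-down adjustment for the four comparisons
p.holm <- p.adjust(p.raw, method = "holm")
print(round(p.holm, 4))
```

Under that assumption the 6 hr and 24 hr differences stay below 0.05 after adjustment, while the 1 hr and 48 hr differences remain non-significant.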

Paul Lemmens