
I have a small table of summary data with the odds ratio and the upper and lower confidence limits for four categories, with six levels within each category. I'd like to produce a chart using ggplot2 that looks similar to the usual one created when you specify an lm and its standard error, but I'd like R to use the pre-specified values in my table rather than fit a model. I've managed to create the line graph with error bars, but the bars overlap and make the plot unclear. The data look like this:

interval  OR     Drug  lower  upper
14        0.004  a     0.002  0.205
30        0.022  a     0.001  0.101
60        0.13   a     0.061  0.23
90        0.22   a     0.14   0.34
180       0.25   a     0.17   0.35
365       0.31   a     0.23   0.41
14        0.84   b     0.59   1.19
30        0.85   b     0.66   1.084
60        0.94   b     0.75   1.17
90        0.83   b     0.68   1.01
180       1.28   b     1.09   1.51
365       1.58   b     1.38   1.82
14        1.9    c     0.9    4.27
30        2.91   c     1.47   6.29
60        2.57   c     1.52   4.55
90        2.05   c     1.31   3.27
180       2.422  c     1.596  3.769
365       2.83   c     1.93   4.26
14        0.29   d     0.04   1.18
30        0.09   d     0.01   0.29
60        0.39   d     0.17   0.82
90        0.39   d     0.2    0.7
180       0.37   d     0.22   0.59
365       0.34   d     0.21   0.53

I have tried this:

limits <- aes(ymax=upper, ymin=lower)
dodge <- position_dodge(width=0.9)
ggplot(data, aes(y=OR, x=days, colour=Drug)) + 
  geom_line(stat="identity") + 
  geom_errorbar(limits, position=dodge)

and searched for a suitable answer to create a pretty plot, but I'm flummoxed!

Any help greatly appreciated!

tonytonov
user4575913
  • Sorry the data came out all jumbled - it's supposed to be 5 columns: interval, OR, Drug, lower and upper. – user4575913 Apr 20 '15 at 09:06
  • Welcome to SO! First, you can see what I edited to make the data and the code look right. Second, you probably mean `x=interval` instead of `x=days` since there's no `days` in your data. Third, it would be nice to give an example of the desired plot (just add a link to it and someone with enough rep will embed it). – tonytonov Apr 20 '15 at 10:36
  • Look at `geom_ribbon` – James Apr 20 '15 at 10:53
  • The variable "days" cannot be found in your data, do you mean interval here? – Ruthger Righart Apr 20 '15 at 11:26
  • Thanks all, yes I did mean interval - oops! – user4575913 Apr 20 '15 at 14:42

2 Answers


You need the following lines:

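library(ggplot2)
# Use bare column names inside aes(); data$lower would bypass ggplot's own data handling
p <- ggplot(data = data, aes(x = interval, y = OR, colour = Drug)) + geom_point() + geom_line()
p <- p + geom_ribbon(aes(ymin = lower, ymax = upper), linetype = 2, alpha = 0.1)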
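
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The data frame is rebuilt from the values posted in the question. The `fill = Drug` mapping, `colour = NA` on the ribbon, and the log-scaled y axis are my own assumptions about what makes the bands easier to tell apart, not something the question asked for.

```r
library(ggplot2)

# Summary table rebuilt from the values posted in the question
data <- data.frame(
  interval = rep(c(14, 30, 60, 90, 180, 365), times = 4),
  Drug     = rep(c("a", "b", "c", "d"), each = 6),
  OR       = c(0.004, 0.022, 0.13, 0.22, 0.25, 0.31,
               0.84, 0.85, 0.94, 0.83, 1.28, 1.58,
               1.9, 2.91, 2.57, 2.05, 2.422, 2.83,
               0.29, 0.09, 0.39, 0.39, 0.37, 0.34),
  lower    = c(0.002, 0.001, 0.061, 0.14, 0.17, 0.23,
               0.59, 0.66, 0.75, 0.68, 1.09, 1.38,
               0.9, 1.47, 1.52, 1.31, 1.596, 1.93,
               0.04, 0.01, 0.17, 0.2, 0.22, 0.21),
  upper    = c(0.205, 0.101, 0.23, 0.34, 0.35, 0.41,
               1.19, 1.084, 1.17, 1.01, 1.51, 1.82,
               4.27, 6.29, 4.55, 3.27, 3.769, 4.26,
               1.18, 0.29, 0.82, 0.7, 0.59, 0.53)
)

p <- ggplot(data, aes(x = interval, y = OR, colour = Drug, fill = Drug)) +
  # Band between the pre-computed limits; colour = NA drops the band outline
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.15, colour = NA) +
  geom_line() +
  geom_point() +
  # Assumption: a log axis, which is a common choice for odds ratios
  scale_y_log10() +
  labs(x = "Interval", y = "Odds ratio")

print(p)
```

If the bands still overlap too much, adding `facet_wrap(~ Drug)` to `p` draws one panel per drug.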


Ruthger Righart

Here is a base R approach using polygon(), since @jmb requested one in the comments. The polygon needs two sets of x-values with their associated y-values, because it is drawn by tracing its outer perimeter: along the lower limits, then back along the upper limits in reverse. I set type = 'n' in plot() and call points() separately so that the points are drawn on top of the polygon. When possible, I prefer the ggplot solutions above, since polygon() is pretty clunky.

library(tidyverse)

data('mtcars')  #built in dataset

mean.mpg = mtcars %>% 
  group_by(cyl) %>% 
  summarise(N = n(),
        avg.mpg = mean(mpg),
        SE.low = avg.mpg - (sd(mpg)/sqrt(N)),
        SE.high =avg.mpg + (sd(mpg)/sqrt(N)))


plot(avg.mpg ~ cyl, data = mean.mpg, ylim = c(10,30), type = 'n')

#note I have defined c(x1, x2) and c(y1, y2)
polygon(c(mean.mpg$cyl, rev(mean.mpg$cyl)),
        c(mean.mpg$SE.low, rev(mean.mpg$SE.high)),
        density = 200, col = 'grey90')

points(avg.mpg ~ cyl, data = mean.mpg, pch = 19, col = 'firebrick')
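
The same perimeter idea can be sketched with numbers from the question rather than mtcars. This uses only the drug "b" rows, copied from the posted table; the names `b`, `band_x` and `band_y` are just illustrative, and I use `border = NA` with a solid fill instead of `density`, which is my own choice.

```r
# Drug "b" rows copied from the question's table
b <- data.frame(
  interval = c(14, 30, 60, 90, 180, 365),
  OR       = c(0.84, 0.85, 0.94, 0.83, 1.28, 1.58),
  lower    = c(0.59, 0.66, 0.75, 0.68, 1.09, 1.38),
  upper    = c(1.19, 1.084, 1.17, 1.01, 1.51, 1.82)
)

# Perimeter: left to right along the lower limit,
# then right to left along the upper limit
band_x <- c(b$interval, rev(b$interval))
band_y <- c(b$lower, rev(b$upper))

# Empty plot first, so the band can be drawn underneath the line and points
plot(OR ~ interval, data = b, type = "n", ylim = range(b$lower, b$upper))
polygon(band_x, band_y, col = "grey90", border = NA)
lines(OR ~ interval, data = b)
points(OR ~ interval, data = b, pch = 19, col = "firebrick")
```

For several drugs you would repeat the polygon(), lines() and points() calls once per drug, with a different colour for each.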
Kodiakflds