3

In ggplot2 I'm attempting a simple thing that I just can't get to work. I have adjusted means and SEs in a data frame and want to plot the means and error bars, then connect the means with a line. Here's the code and the error (it does everything except connect the means with geom_line; I'm working from RCookbook):

library(ggplot2)
#data set
data1 <- structure(list(group = structure(1:3, .Label = c("1", "2", "3"
), class = "factor"), estimate = c(55.7466654122763, 65.0480954172939, 
61.9552391704298), SE = c(2.33944612149257, 2.33243565412438, 
2.33754952927041), t.ratio = c(23.8290016171476, 27.8884844271143, 
26.5043535525714)), .Names = c("group", "estimate", "SE", "t.ratio"
), row.names = c(NA, 3L), class = "data.frame")

#the attempted plot
pd <- position_dodge(.1)
ggplot(data1, aes(x=group, y=estimate, group=group)) + 
    geom_errorbar(aes(ymin=estimate-SE, ymax=estimate+SE), 
        colour="black", width=.1, position=pd) +
    geom_line(data=data1, aes(x=group, y=estimate)) + 
    geom_point(position=pd, size=4)

The error:

ymax not defined: adjusting position using y instead
geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?
mnel
Tyler Rinker

4 Answers

6

If you remove the grouping by group within the ggplot call and set x = as.numeric(group) within the call to geom_line, then it works.

Also, you don't need to re-reference data1 within geom_line:

ggplot(data1, aes(x=group, y=estimate)) + 
  geom_errorbar(aes(ymin=estimate-SE, ymax=estimate+SE), 
                colour="black", width=.1, position=pd) +
  geom_line( aes(x=as.numeric(group), y=estimate)) + 
  geom_point(position=pd, size=4)


If you group by group, then each group contains only one value for geom_line to draw a line from, hence the message. The same message appears if ggplot treats the x or y mapping variable as a factor: when you code a variable as a factor, R (and ggplot) will treat its levels as independent groups and will not connect the points. This is a sensible default behaviour.
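The "only one observation" message can be checked without plotting. This is a minimal base R sketch, assuming a small stand-in data frame `d` (hypothetical, with rounded values from data1); it just counts rows per group under the two groupings.

```r
# Hypothetical stand-in for data1, values rounded for brevity
d <- data.frame(
  group    = factor(c("1", "2", "3")),
  estimate = c(55.7, 65.0, 62.0)
)

# Grouping by the factor: each level holds exactly one row,
# so a line layer has nothing to join within a group
obs_per_group <- table(d$group)
print(obs_per_group)

# Grouping by a constant: all three rows fall into one group,
# which is what a connecting line needs
obs_single <- table(rep(1, nrow(d)))
print(obs_single)
```

With one row per group there is no second point to draw to, which matches what geom_path reports.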

EDIT - with alphabetic factor labels

This will work with alphabetic factor labels because of the way R codes factors internally (i.e. as.numeric(factor) returns the integer codes, not the factor labels).

For example, changing group to a, b, c:

levels(data1[['group']]) <- letters[1:3] 
ggplot(data1, aes(x=group, y=estimate)) + 
  geom_errorbar(aes(ymin=estimate-SE, ymax=estimate+SE), 
                colour="black", width=.1, position=pd) +
  geom_line( aes(x=as.numeric(group), y=estimate)) + 
  geom_point(position=pd, size=4)
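To see what as.numeric actually returns for a factor, here is a small base R sketch. The factors `f_alpha` and `f_num` are hypothetical and only for illustration. Note the flip side: for numeric-looking labels, as.numeric gives the level codes rather than the printed values, which is fine for positioning points on a discrete axis but not for recovering the numbers.

```r
# Hypothetical factors, for illustration only
f_alpha <- factor(c("a", "b", "c"))
alpha_codes <- as.numeric(f_alpha)   # 1 2 3: positions of the levels, not the labels
print(alpha_codes)

f_num <- factor(c("10", "20", "30"))
codes  <- as.numeric(f_num)                 # 1 2 3: still the level codes
values <- as.numeric(as.character(f_num))   # 10 20 30: the labels parsed as numbers
print(codes)
print(values)
```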


mnel
  • Thank you mnel. The groups in this case happen to work out but what if they were group A, B and C instead? Also I knew I didn't have to reapply the data set in geom_line but desperate measures called for insane thinking :) – Tyler Rinker Sep 20 '12 at 01:15
  • Could someone explain the rationale for why I need to convert to numeric or use the group = "all" trick? – Tyler Rinker Sep 20 '12 at 01:17
  • And I've further explained why as.numeric is required (or more so why factor axes don't work here) – mnel Sep 20 '12 at 01:21
2

As an alternative to mnel's answer, you could create a new variable, so that you have a column where all 3 groups have the same value:

 data1$all <- "All"

And then use that as the group aesthetic for your line:

ggplot(data1, aes(x=group, y=estimate)) + 
    geom_errorbar(aes(ymin=estimate-SE, ymax=estimate+SE), 
        colour="black", width=.1, position=pd) +
    geom_line(aes(x=group, y=estimate, group=all)) + 
    geom_point(position=pd, size=4)

mnel's answer is probably more elegant, but this might work better if the groups aren't numbers and can't be converted to numeric so straightforwardly.

Marius
  • The `as.numeric` trick should work with any factor due to the way `R` deals with factors (integer with label) – mnel Sep 20 '12 at 01:14
  • I prefer this grouping method when the dataframe already contains a value like "ALL". +1 for simplicity here. – blehman Jan 05 '14 at 05:10
1

You might also look at the 2nd answer to this SO Question. If you are working toward a fuller implementation, this might save you some time.

Community
Bryan Hanson
0

Here is a less verbose approach to connect a categorical/factor column with a line:

levels(data1[['group']]) <- letters[1:3] 

ggplot(data1, aes(x = group, y = estimate)) + 
      geom_line(aes(group = 1)) + 
      geom_pointrange(aes(ymin = estimate-SE, ymax = estimate+SE))


My guess is that this is made a bit obscure and hard to find in ggplot because it is generally not advisable to connect non-continuous spaces with a line.

If you also want the lines grouped by a color aesthetic, you need to use interaction:

library(dplyr)  # for bind_rows

# add a fourth row, then relabel so that each color group has an 'a' and a 'b'
data2 <- bind_rows(data1, list('group' = 'b', 'estimate' = 67, 'SE' = 2.2, 't.ratio' = 27))
data2$group[3] <- 'a'
data2$color_group <- c('one', 'one', 'two', 'two')
data2
#   group estimate       SE  t.ratio color_group
# 1     a 55.74667 2.339446 23.82900         one
# 2     b 65.04810 2.332436 27.88848         one
# 3     a 61.95524 2.337550 26.50435         two
# 4     b 67.00000 2.200000 27.00000         two

ggplot(data2, aes(x = group, y = estimate, color = color_group)) + 
      geom_line(aes(group = interaction(1, color_group))) + 
      geom_pointrange(aes(ymin = estimate-SE, ymax = estimate+SE))
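For anyone curious what interaction builds here, a minimal base R sketch follows. The vector `color_group` simply mirrors the column in data2; nothing else is assumed. It shows that the result is a factor with one level per combination, and that in this particular example the constant 1 does not change how rows are split, so grouping by color_group alone would presumably draw the same lines.

```r
# Mirrors the color_group column from data2
color_group <- c("one", "one", "two", "two")

# One factor level per combination of the constant 1 and color_group
g <- interaction(1, color_group)
print(levels(g))
print(as.character(g))

# In this example the rows are split exactly as by color_group alone
same_split <- identical(as.integer(g), as.integer(factor(color_group)))
print(same_split)
```

interaction becomes more useful when two or more varying columns should jointly define a line, for example a subject id crossed with a condition.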


joelostblom