
I have a dataset that looks a little like this:

library(ggplot2)
a <- data.frame(x=rep(c(1,2,3,5,7,10,15,20), 5),
                y=rnorm(40, sd=2) + rep(c(4,3.5,3,2.5,2,1.5,1,0.5), 5))
ggplot(a, aes(x=x,y=y)) + geom_point() + geom_smooth()

graph output

I want the same output as that plot, but instead of a smooth curve, I just want line segments connecting the mean/sd values for each set of x values. The graph should look similar to the one above, but jagged instead of curved.

I tried this, but it fails, even though the x values aren't unique:

ggplot(a, aes(x=x,y=y)) + geom_point() +stat_smooth(aes(group=x, y=y, x=x))
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
naught101

4 Answers


?stat_summary is what you should look at.

Here is an example

# Upper and lower bounds of the ribbon: mean +/- z * sd, with z = qnorm(1 - .alpha/2)
# (about 1.96 for .alpha = 0.05). .alpha has a default, so it need not be passed through stat_summary().
uci <- function(y, .alpha = 0.05) mean(y) + qnorm(1 - abs(.alpha) / 2) * sd(y)
lci <- function(y, .alpha = 0.05) mean(y) - qnorm(1 - abs(.alpha) / 2) * sd(y)
ggplot(a, aes(x = x, y = y)) +
  stat_summary(fun.y = mean, geom = 'line', colour = 'blue') +
  stat_summary(fun.y = mean, fun.ymax = uci, fun.ymin = lci,
               geom = 'ribbon', alpha = 0.5)
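
To see what numbers the line and ribbon are drawn from, here is a rough base R sketch that computes the per-x mean and the mean +/- z * sd band by hand. The `set.seed(1)` call and the `band` data frame are my own additions for illustration and are not part of the question's data.

```r
# Rebuild the question's data; the seed is an assumption added for reproducibility
set.seed(1)
a <- data.frame(x = rep(c(1, 2, 3, 5, 7, 10, 15, 20), 5),
                y = rnorm(40, sd = 2) + rep(c(4, 3.5, 3, 2.5, 2, 1.5, 1, 0.5), 5))

# Normal quantile for a two-sided 95% band (about 1.96)
z <- qnorm(1 - 0.05 / 2)

# One row per unique x; tapply() returns groups in sorted order of x
band <- data.frame(x = sort(unique(a$x)))
band$mean <- as.numeric(tapply(a$y, a$x, mean))
band$sd   <- as.numeric(tapply(a$y, a$x, sd))
band$ymin <- band$mean - z * band$sd
band$ymax <- band$mean + z * band$sd

print(band)
```

Joining `band$mean` with straight segments gives the jagged line, and `ymin`/`ymax` give the ribbon edges.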


mnel

You can use the built-in summary function mean_sdl. The code is shown below:

ggplot(a, aes(x=x,y=y)) + 
 stat_summary(fun.y = 'mean', colour = 'blue', geom = 'line') +
 stat_summary(fun.data = 'mean_sdl', geom = 'ribbon', alpha = 0.2)
Ramnath

Using ggplot2 0.9.3.1, the following did the trick for me:

ggplot(a, aes(x=x,y=y)) + geom_point() +
 stat_summary(fun.data = 'mean_sdl', mult = 1, geom = 'smooth')

'mean_sdl' is a wrapper around the Hmisc package's function 'smean.sdl', and the mult argument sets how many standard deviations (above and below the mean) are displayed.

For detailed info on the original function:

library('Hmisc')
?smean.sdl
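
As a rough illustration of what that summary boils down to, here is a minimal base R sketch. The name `mean_sdl_sketch` is hypothetical and this is not the Hmisc or ggplot2 source; it only assumes the summary returns the mean plus and minus `mult` standard deviations in the `y`/`ymin`/`ymax` columns that `fun.data` expects.

```r
# Hypothetical stand-in for mean_sdl: mean +/- mult * sd as a one-row data frame
mean_sdl_sketch <- function(y, mult = 2) {
  m <- mean(y)
  s <- sd(y)
  data.frame(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

# For c(1, 2, 3) the mean is 2 and the sd is 1
res <- mean_sdl_sketch(c(1, 2, 3), mult = 1)
print(res)  # y = 2, ymin = 1, ymax = 3
```

With `mult = 1` the band is one standard deviation either side of the mean, which is the mean/sd band the question asks for.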
dlaehnemann

You could try writing a summary function as suggested by Hadley Wickham on the website for ggplot2: http://had.co.nz/ggplot2/stat_summary.html. Applying his suggestion to your code:

p <- qplot(x, y, data=a)

stat_sum_df <- function(fun, geom="crossbar", ...) { 
 stat_summary(fun.data=fun, colour="blue", geom=geom, width=0.2, ...) 
} 

p + stat_sum_df("mean_cl_normal", geom = "smooth") 

This results in this graphic:

enter image description here

smillig
  • Nice. But I don't really understand why you're wrapping it in that function. Why not just use `p + stat_summary("mean_cl_normal", geom = "smooth", colour="blue", width=0.2)`? Also, what is the `width=0.2` for? Doesn't seem to make much difference to the output... – naught101 Aug 21 '12 at 00:36
  • The function uses the summary functions from the `Hmisc` package. I thought the `width=0.2` would change the width of the line, but it doesn't seem to do anything as you say, so it apparently has no function! – smillig Aug 21 '12 at 05:11
  • `size` will change the width of the line. – Gregor Thomas May 05 '13 at 15:31