This is my first Stack Overflow post and I am a relatively new R user, so please go gently!

I have a data frame with three columns: a participant identifier, a condition (a factor with two levels, Placebo or Experimental), and an outcome score.

set.seed(1)
dat <- data.frame(Condition = c(rep("Placebo",10),rep("Experimental",10)), 
                  Outcome = rnorm(20,15,2), 
                  ID = factor(rep(1:10,2)))

I would like to construct a bar plot with two bars showing the mean outcome score for each condition, with the standard deviation as an error bar. I would then like to overlay lines connecting the points for each participant's score in each condition, so that the plot displays the individual responses as well as the group mean. If possible, I would also like to include an axis break.

I don't seem to be able to find any advice in other threads; apologies if I am repeating a question.

Many Thanks.

P.S. I realise that presenting data in this way will not be to everyone's taste. It is for a specific requirement!

rawr
sp202
  • I would guess providing some example data set and a link to a similar picture to your desired output will get you started – David Arenburg Dec 08 '14 at 20:50
  • [This question](http://stackoverflow.com/q/5963269/903061) has good advice for creating nice questions. `dput()` is nice for sharing data, and if you find a suitable built-in dataset, do try to create a nice *minimal* example that has the necessary features. – Gregor Thomas Dec 08 '14 at 21:00
  • See [Plotting means and error bars (ggplot2)](http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/) – zx8754 Dec 08 '14 at 21:04
  • Thanks for the advice. Is the below a suitable example of data: `dat <- data.frame(Condition = c(rep("Placebo",10),rep("Experimental",10)), Outcome = rnorm(20,15,2), ID = factor(rep(1:10,2)))` – sp202 Dec 08 '14 at 21:06
  • In general, with code, you should edit things into your question rather than leaving them in comments. – Gregor Thomas Dec 08 '14 at 21:17
  • Thanks for all the help everyone. I have managed to create the desired plot. – sp202 Dec 09 '14 at 08:15

3 Answers

This ought to work:

library(ggplot2)
library(dplyr)

# Summarise first: one row per condition with the mean and SD of Outcome
dat.summ <- dat %>% group_by(Condition) %>%
  summarize(mean.outcome = mean(Outcome),
            sd.outcome = sd(Outcome))

ggplot(dat.summ, aes(x = Condition, y = mean.outcome)) +
  # Bars show the group means
  geom_bar(stat = "identity") +
  # Error bars span mean - SD to mean + SD
  geom_errorbar(aes(ymin = mean.outcome - sd.outcome,
                    ymax = mean.outcome + sd.outcome),
                color = "dodgerblue", width = 0.3) +
  # Individual scores come from the raw data, joined per participant via group = ID
  geom_point(data = dat, aes(x = Condition, y = Outcome),
             color = "firebrick", size = 1.2) +
  geom_line(data = dat, aes(x = Condition, y = Outcome, group = ID),
            color = "firebrick", size = 1.2, alpha = 0.5) +
  scale_y_continuous(limits = c(0, max(dat$Outcome)))


Some people are better with ggplot's stat functions and arguments than I am and might do it differently. I prefer to just transform my data first.
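If dplyr is not available, the same kind of summary table can be built in base R. This is only a sketch: it reuses the `dat` example from the question and keeps the column names `mean.outcome` and `sd.outcome`, so the ggplot call above should work with it unchanged.

```r
set.seed(1)
dat <- data.frame(Condition = c(rep("Placebo", 10), rep("Experimental", 10)),
                  Outcome = rnorm(20, 15, 2),
                  ID = factor(rep(1:10, 2)))

# Per-condition mean and SD; tapply() returns one value per condition,
# named by condition
means <- tapply(dat$Outcome, dat$Condition, mean)
sds   <- tapply(dat$Outcome, dat$Condition, sd)

# Same layout as the dplyr summary: one row per condition
dat.summ <- data.frame(Condition    = names(means),
                       mean.outcome = as.vector(means),
                       sd.outcome   = as.vector(sds))
```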

Gregor Thomas
set.seed(1)
dat <- data.frame(Condition = c(rep("Placebo",10),rep("Experimental",10)), 
                  Outcome = rnorm(20,15,2), 
                  ID = factor(rep(1:10,2)))

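# Reshape to wide format: one row per participant, one Outcome column per condition
dat.w <- reshape(dat, direction = 'wide', idvar = 'ID', timevar = 'Condition')

# Mean and SD of each condition (columns 2 and 3 hold the two Outcome columns)
means <- colMeans(dat.w[, 2:3])
sds <- apply(dat.w[, 2:3], 2, sd)
# Error bar limits are mean -/+ 1 SD (despite the "ci" names, not a confidence interval)
ci.l <- means - sds
ci.u <- means + sds
ci.width <- .25  # half-width of the error bar caps

# barplot() returns the x positions of the bar midpoints
bp  <- barplot(means, ylim = c(0,20))
segments(bp, ci.l, bp, ci.u)                        # vertical error bars
segments(bp - ci.width, ci.u, bp + ci.width, ci.u)  # upper caps
segments(bp - ci.width, ci.l, bp + ci.width, ci.l)  # lower caps
# One line and one pair of points per participant, coloured by participant
segments(x0 = bp[1], x1 = bp[2], y0 = dat.w[, 2], y1 = dat.w[, 3], col = 1:10)
points(c(rep(bp[1], 10), rep(bp[2], 10)), dat$Outcome, col = 1:10, pch = 19)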


rawr

Here is a method using the transformations inside ggplot2:

ggplot(dat) + 
  # fun.y is deprecated since ggplot2 3.3.0; newer versions use fun instead
  stat_summary(aes(x=Condition, y=Outcome, group=Condition), fun.y="mean", geom="bar") + 
  stat_summary(aes(x=Condition, y=Outcome, group=Condition), fun.data="mean_se", geom="errorbar", col="green", width=.8, size=2) + 
  geom_line(aes(x=Condition, y=Outcome, group=ID), col="red")

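Note that `mean_se` draws the mean plus or minus one standard error, whereas the question asks for the standard deviation. A minimal sketch of an SD variant follows. The helper `mean_sd` is hypothetical (it is not part of ggplot2), and the `fun` argument assumes ggplot2 3.3.0 or later; on older versions `fun.y` would be needed instead.

```r
# Hypothetical helper (not part of ggplot2): returns the mean and mean -/+ 1 SD
# in the y / ymin / ymax columns that stat_summary(fun.data = ...) expects
mean_sd <- function(x) {
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - s, ymax = m + s)
}

set.seed(1)
dat <- data.frame(Condition = c(rep("Placebo", 10), rep("Experimental", 10)),
                  Outcome = rnorm(20, 15, 2),
                  ID = factor(rep(1:10, 2)))

# Only build the plot when ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x = Condition, y = Outcome)) +
    stat_summary(fun = mean, geom = "bar") +                   # group means
    stat_summary(fun.data = mean_sd, geom = "errorbar",
                 width = 0.3) +                                # mean -/+ 1 SD
    geom_line(aes(group = ID), colour = "red")                 # one line per participant
  print(p)
}
```

Writing the helper by hand avoids depending on any extra package for the SD limits; the rest of the call mirrors the answer above.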

MrFlick