
I was trying to create a simple line graph of means and interactions. I have a DV (reading times) on the y-axis, one factor (Length) on the x-axis, and another as a grouping variable (position).

The syntax I used is below. The data plotted as single points on a line for each of the two Length conditions, but did not connect with lines between the two Length conditions. What am I missing in terms of syntax?

I am using R i386 2.15.2, and updated ggplot2 last week.

Here is a reproducible example:

SubjectID <- c(101,101,101,101,101,101,101,101,102,102,102,102,102,102,102,102,
        201,201,201,201,201,201,201,201,202,202,202,202,202,202,202,202)
Group <- c("PWA","PWA","PWA","PWA","PWA","PWA","PWA","PWA","PWA","PWA","PWA",
        "PWA","PWA","PWA","PWA","PWA","Control","Control","Control",
        "Control","Control","Control","Control","Control","Control",
        "Control","Control","Control","Control","Control","Control",
        "Control")
Length <- c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)
Pos <- c(1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2)
ReadT <- c(6.7,7.6,6.4,7.9,5.4,6.4,6.3,7.4,6.9,7.2,6.7,7.4,5.7,6.1,6.5,7.8,
        6.1,5.7,4.9,6.1,4.7,6.5,6.1,6.2,6.9,5.9,4.8,6.5,4.6,6.3,6.7,6.6)

data <- data.frame(SubjectID, Group, Length, Pos, ReadT)
data$Length <- factor(data$Length, ordered = TRUE,
        levels = c(1,2),
        labels = c("Length 1", "Length 2"))
data$Pos <- factor(data$Pos, ordered = TRUE,
        levels = c(1,2),
        labels = c("Position 1", "Position 2"))

qplot(Length, data=data, ReadT, geom=c("point", "line"), 
    stat="summary", fun.y=mean, group=Pos, colour=Pos, 
    facets = ~Group)
    Please provide [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) - add sample data, preferably using dput() – Didzis Elferts Feb 26 '13 at 18:59

1 Answer


I don't think you have reproduced any inconsistency, but your issues are clouded in part by trying to condense everything into a single qplot call.

Your x variable Length is a factor, therefore ggplot is sensibly considering Length 1 and Length 2 to be independent, and won't connect the lines.

Secondly, you won't be able to use stat_summary to summarize by your x values without forcing these to be a factor (and hence independent).

I find it easiest to pre-summarize the data and not rely on ggplot.

For example:

library(plyr)
data.means <- ddply(data, .(Group, Pos, Length), summarize, ReadT = mean(ReadT))
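If plyr is not available, the same cell means can be computed in base R. The following is a minimal sketch, not part of the original answer: it rebuilds the question's data compactly with rep() (SubjectID is left out because it is not needed for the means) and uses aggregate(). The name means_base is a hypothetical choice for illustration.

```r
# Rebuild the question's data compactly (same values, same row order).
data <- data.frame(
  Group  = rep(c("PWA", "Control"), each = 16),
  Length = factor(rep(1:2, times = 16), levels = 1:2,
                  labels = c("Length 1", "Length 2")),
  Pos    = factor(rep(rep(1:2, each = 2), times = 8), levels = 1:2,
                  labels = c("Position 1", "Position 2")),
  ReadT  = c(6.7,7.6,6.4,7.9,5.4,6.4,6.3,7.4,6.9,7.2,6.7,7.4,5.7,6.1,6.5,7.8,
             6.1,5.7,4.9,6.1,4.7,6.5,6.1,6.2,6.9,5.9,4.8,6.5,4.6,6.3,6.7,6.6)
)

# One mean per Group x Pos x Length cell (2 x 2 x 2 = 8 rows).
means_base <- aggregate(ReadT ~ Group + Pos + Length, data = data, FUN = mean)
```

The resulting data frame has the columns Group, Pos, Length and ReadT, so it can be passed to the plotting code in place of data.means.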

Then construct the plot using ggplot rather than qplot, which gives you the flexibility (and transparency) required.

The trick to getting the lines connected is to treat x as numeric within the call to geom_line, for example:

ggplot(data.means, aes(x= Length, y= ReadT, colour = Pos)) + 
 geom_point() +
 geom_line(aes(x=as.numeric(Length))) +
 facet_grid(~Group)
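The as.numeric(Length) call above works because a factor is stored as integer codes, one per level, and as.numeric() returns those codes. A small illustration, using a hypothetical vector len that is not part of the original data:

```r
# A two-level factor built the same way as data$Length in the question.
len <- factor(c(1, 2, 1, 2), levels = c(1, 2),
              labels = c("Length 1", "Length 2"))

# as.numeric() drops the labels and returns the underlying level codes,
# which places "Length 1" at x = 1 and "Length 2" at x = 2.
codes <- as.numeric(len)
print(codes)
```

Because the codes line up with the positions of the factor levels on the discrete axis, the line drawn from the numeric x values passes through the plotted points.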

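If you insist on using the raw data and the stat_* functions, you could also replicate this using stat_smooth to estimate the means (which keeps x classified as numeric). Because there are only two Length levels, a linear fit within each group passes exactly through the two group means.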

ggplot(data, aes(x = Length, y= ReadT, colour = Pos)) + 
 stat_summary(fun.y = 'mean', geom = 'point')+
 stat_smooth(method = 'lm', aes(x=as.numeric(Length)), se = FALSE) +
 facet_grid(~Group)
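A further option, offered here as a hedged sketch rather than as the answer's method, is to keep Length as a factor and set the group aesthetic explicitly, so that the summary line is drawn within each Pos group. This assumes ggplot2 3.3.0 or later, where the fun argument replaces the deprecated fun.y. The data frame is rebuilt compactly so the block runs on its own.

```r
library(ggplot2)

# Rebuild the question's data compactly (same values, same row order).
data <- data.frame(
  Group  = rep(c("PWA", "Control"), each = 16),
  Length = factor(rep(1:2, times = 16), levels = 1:2,
                  labels = c("Length 1", "Length 2")),
  Pos    = factor(rep(rep(1:2, each = 2), times = 8), levels = 1:2,
                  labels = c("Position 1", "Position 2")),
  ReadT  = c(6.7,7.6,6.4,7.9,5.4,6.4,6.3,7.4,6.9,7.2,6.7,7.4,5.7,6.1,6.5,7.8,
             6.1,5.7,4.9,6.1,4.7,6.5,6.1,6.2,6.9,5.9,4.8,6.5,4.6,6.3,6.7,6.6)
)

# group = Pos tells ggplot which points belong to the same line,
# even though the x axis is discrete.
p <- ggplot(data, aes(x = Length, y = ReadT, colour = Pos, group = Pos)) +
  stat_summary(fun = mean, geom = "point") +
  stat_summary(fun = mean, geom = "line") +
  facet_grid(~Group)

print(p)
```

Whether this behaves the same way in the ggplot2 version from the question (early 2013) has not been checked here.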
mnel
  • You are absolutely correct, I have not been able to replicate the one time it worked, so it will remain a mystery. I have edited that part out of the title and the question. Thank you for your very helpful comments! I also saw another post addressing the same issue (which I had missed previously) that points to another package that might be helpful here; see: ggplot2: line connecting the means of grouped data – user2112401 Feb 27 '13 at 02:09
  • In the first ggplot code there need to be 3 closing parentheses after ..numeric(Length – user2112401 Feb 27 '13 at 02:46