
I have a data frame mydataAll with columns DESWC, journal, and highlight. To calculate the average and standard deviation of DESWC for each journal, I do

avg <- aggregate(DESWC ~ journal, data = mydataAll, mean)
stddev <- aggregate(DESWC ~ journal, data = mydataAll, sd)

Now I plot a horizontal stripchart with the values of DESWC along the x-axis and each journal along the y-axis. But for each journal, I want to indicate the standard deviation and average with a simple line. Here is my current code and the results.

stripchart2 <- 
  ggplot(data=mydataAll, aes(x=mydataAll$DESWC, y=mydataAll$journal, color=highlight)) +
  geom_segment(aes(x=avg[1,2] - stddev[1,2], 
               y = avg[1,1], 
               xend=avg[1,2] + stddev[1,2], 
               yend = avg[1,1]), color="gray78") +
  geom_segment(aes(x=avg[2,2] - stddev[2,2], 
               y = avg[2,1], 
               xend=avg[2,2] + stddev[2,2], 
               yend = avg[2,1]), color="gray78") +
  geom_segment(aes(x=avg[3,2] - stddev[3,2], 
               y = avg[3,1], 
               xend=avg[3,2] + stddev[3,2], 
               yend = avg[3,1]), color="gray78") +
  geom_point(size=3, aes(alpha=highlight)) + 
  scale_x_continuous(limit=x_axis_range) +
  scale_y_discrete(limits=mydataAll$journal) +
  scale_alpha_discrete(range = c(1.0, 0.5), guide='none') 
show(stripchart2)

[Image: stripchart produced by the code above]

See the three horizontal geom_segments at the bottom of the image indicating the spread? I want to do that for all journals, but without handcrafting each one. I tried using the solution from this question, but when I put everything in a loop and remove the aes(), it gives me this error:

Error in x - from[1] : non-numeric argument to binary operator

Can anyone help me condense the geom_segment() statements?

tn3rt
  • Please add a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to your question. It looks like you need to get `avg`, `stddev`, and `journal` into a single dataset for plotting (maybe merge?). Then you could add all the segments in one `geom_segment` call. – aosmith May 12 '16 at 18:39

1 Answer


I generated some dummy data to demonstrate. First, we use aggregate like you have done, then we combine those results into a single data.frame and add upper and lower columns (the mean plus and minus one standard deviation). Then we pass this new dataset to a single geom_segment call, which draws one segment per group. Also, I specify x as the character variable and y as the numeric variable, and then use coord_flip() to make the chart horizontal:

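library(ggplot2)
set.seed(123)

# Dummy data: a character grouping column and a numeric value column
df <- data.frame(lets = sample(letters[1:8], 100, replace = TRUE),
                 vals = rnorm(100),
                 stringsAsFactors = FALSE)

# Per-group mean and standard deviation
means <- aggregate(vals ~ lets, data = df, FUN = mean)
sds <- aggregate(vals ~ lets, data = df, FUN = sd)

# Combine side by side; the duplicated columns become lets.1 and vals.1 (the sd)
df2 <- data.frame(means, sds)
df2$upper <- df2$vals + df2$vals.1
df2$lower <- df2$vals - df2$vals.1

# One geom_segment() call draws a mean +/- sd line for every group
ggplot(df, aes(x = lets, y = vals)) +
  geom_point() +
  geom_segment(data = df2, aes(x = lets, xend = lets, y = lower, yend = upper)) +
  coord_flip() +
  theme_bw()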

[Image: resulting plot]

Here, the lets column plays the role of your character variable (journal), and vals plays the role of DESWC.
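Adapted to the column names in the question, the same idea might look like the sketch below. The contents of mydataAll are invented here, since the real data was not posted; only the column names DESWC, journal, and highlight come from the question. The name spread is also just illustrative. This version combines the two summaries with merge() on journal, which is an alternative to the data.frame() call above, and it maps color and alpha inside geom_point() so that the segment layer does not look for a highlight column in the summary data.

```r
library(ggplot2)
set.seed(42)

# Hypothetical stand-in for mydataAll; only the column names come from the question
mydataAll <- data.frame(
  journal   = sample(paste("Journal", LETTERS[1:5]), 60, replace = TRUE),
  DESWC     = round(runif(60, 2000, 9000)),
  highlight = sample(c("yes", "no"), 60, replace = TRUE),
  stringsAsFactors = FALSE
)

# Per-journal mean and sd, merged by journal into one summary data frame
avg    <- aggregate(DESWC ~ journal, data = mydataAll, FUN = mean)
stddev <- aggregate(DESWC ~ journal, data = mydataAll, FUN = sd)
spread <- merge(avg, stddev, by = "journal", suffixes = c(".mean", ".sd"))
spread$lower <- spread$DESWC.mean - spread$DESWC.sd
spread$upper <- spread$DESWC.mean + spread$DESWC.sd

# A single geom_segment() layer draws the mean +/- sd line for every journal
stripchart2 <- ggplot(mydataAll, aes(x = DESWC, y = journal)) +
  geom_segment(data = spread,
               aes(x = lower, xend = upper, y = journal, yend = journal),
               color = "gray78") +
  geom_point(aes(color = highlight, alpha = highlight), size = 3) +
  scale_alpha_discrete(range = c(1.0, 0.5), guide = "none")
print(stripchart2)
```

Because the segments are drawn directly along the x-axis with journal on the y-axis, this variant does not need coord_flip(). The key point is the same as above: the summary values live in columns of a data frame and are mapped inside aes(), rather than being indexed one row at a time.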

bouncyball
  • Yep, I tested it using a factor to color the points, you may want to have the `aes(colour = foo)` call within the `geom_point` call. – bouncyball May 12 '16 at 19:08