
Is there a way to place horizontal lines with the group means on a plot without creating the summary data set ahead of time? I know this works, but I feel there must be a way to do this with just ggplot2.

library(dplyr)
library(ggplot2)
library(cowplot)  # provides background_grid()
X <- tibble(      # tibble() replaces the deprecated data_frame()
  x = rep(1:5, 3),
  y = c(rnorm(5, 5, 0.5),
        rnorm(5, 3, 0.3),
        rnorm(5, 4, 0.7)),
  grp = rep(LETTERS[1:3], each = 5))

X.mean <- X %>%
  group_by(grp) %>%
  summarize(y = mean(y))

X %>%
  ggplot(aes(x = x, y = y, color = grp)) +
  geom_point(shape = 19) +
  geom_hline(data = X.mean, aes(group = grp, yintercept = y, color = grp)) +
  background_grid()

[Figure: grouped mean lines]

wdkrnls

1 Answer


Expanding on my comment:

ggplot(X, aes(x = x, y = y, color = grp)) +
  geom_point(shape = 19) +
  stat_smooth(method = "lm", formula = y ~ 1, se = FALSE) +
  theme_bw()

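This fits a linear model with only a constant (intercept) term to each group. The least-squares estimate of that intercept is the group mean, so each fitted line is horizontal at its group's mean. Credit to this answer for the basic idea.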
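To see why the intercept-only model gives the mean, here is a minimal base R sketch. The seed and the helper names `lm_means` and `group_means` are my own choices for illustration and are not part of the original question.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Same shape of data as in the question, built with base R only
y   <- c(rnorm(5, 5, 0.5), rnorm(5, 3, 0.3), rnorm(5, 4, 0.7))
grp <- rep(LETTERS[1:3], each = 5)

# Intercept of an intercept-only fit, computed separately for each group
lm_means <- sapply(split(y, grp), function(v) unname(coef(lm(v ~ 1))))

# Plain group means for comparison
group_means <- tapply(y, grp, mean)

# The two agree up to floating-point tolerance
print(all.equal(as.numeric(lm_means), as.numeric(group_means)))
```

This is the per-group fit that stat_smooth performs when the color aesthetic splits the data into groups.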

Edit: Response to OP's very clever suggestion.

It looks like you can use quantile regression to generate the medians!

library(quantreg)
ggplot(X, aes(x = x, y = y, color = grp)) +
  geom_point(shape = 19) +
  stat_smooth(method = "rq", formula = y ~ 1, se = FALSE) +
  theme_bw()

The basic requirement for stat_smooth(method=..., ...) is that the method returns an object for which there is a predict(...) method. Here rq(...) returns an rq object, and there is a predict.rq(...) method. By default rq(...) fits the 0.5 quantile, which is why the lines land on the medians. Using se=TRUE can fail, because not all predict methods return standard errors of the estimates.
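One way to check these two conditions before plotting is sketched below in base R, using lm as the example fit. The toy values and the names `has_predict` and `has_se` are assumptions for illustration; the same check should apply to other model classes, but that is not verified here.

```r
# Toy intercept-only fit; the numbers are made up for illustration
fit <- lm(y ~ 1, data = data.frame(y = c(4.8, 5.1, 5.3, 4.9, 5.4)))

# Is a predict() S3 method registered for the class of the fitted object?
has_predict <- !is.null(getS3method("predict", class(fit)[1], optional = TRUE))

# Does that predict() method return standard errors when asked?
pred   <- predict(fit, se.fit = TRUE)
has_se <- !is.null(pred$se.fit)

print(c(has_predict = has_predict, has_se = has_se))
```

If the second check fails for a given model class, se = FALSE is the safer setting in stat_smooth.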

jlhoward