I am working on a plot where I am comparing values (y variable) for two groups (x variables) across numerous sites (facets). Using ggplot, I have been able to facet the plot based on the faceting variable (in this case site) and display my data, but have been unable to determine how to add a line segment to each group that indicates the median value for that group.

Reproducible example:

library(tidyverse)

df <- diamonds %>%
  filter(color == "D" | color == "E") %>%
  filter(carat > 1)

p <- ggplot(data=df, aes(x = color, y=carat, fill=color)) +
  geom_jitter(shape = 21, col="black") +
  facet_wrap(~ cut, ncol = 5)

p

Outputs:

Current output

What I would like to output would be something like the following (note, lines not actually drawn at medians):

Example of desired output

reedms
  • Does this answer your question? [Display a summary line per facet rather than overall](https://stackoverflow.com/questions/50980134/display-a-summary-line-per-facet-rather-than-overall) – desval Jul 02 '20 at 18:22
  • Unfortunately, no. I'm looking for a median value for each color at each cut, whereas that example would only display a single median per cut. – reedms Jul 02 '20 at 18:34
  • Can you use the information in the answer to [that question](https://stackoverflow.com/questions/50980134/display-a-summary-line-per-facet-rather-than-overall) to figure out the answer to your own question? – Pranav Hosangadi Jul 02 '20 at 21:55

2 Answers

Maybe this, which adds a crossbar at the median of each color within each facet:

p + stat_summary(fun = "median", fun.min = "median", fun.max= "median", size= 0.3, geom = "crossbar")

See also the related question "ggplot2: add line for average per group".
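
To check that the crossbars really sit at the per-color, per-cut medians, the values computed by the `stat_summary()` layer can be compared with a manual summary. This is only a sketch: the names `drawn` and `manual` are mine, I dropped the `size` argument, and it assumes ggplot2 3.3.0 or later (where the `fun`, `fun.min` and `fun.max` arguments exist).

```r
library(tidyverse)

df <- diamonds %>%
  filter(color %in% c("D", "E"), carat > 1)

p <- ggplot(df, aes(x = color, y = carat, fill = color)) +
  geom_jitter(shape = 21, colour = "black") +
  stat_summary(fun = "median", fun.min = "median", fun.max = "median",
               geom = "crossbar") +
  facet_wrap(~ cut, ncol = 5)

# Values computed by the stat_summary layer (the second layer of the plot),
# sorted by facet panel and then by x position
drawn <- layer_data(p, 2)
drawn <- drawn[order(drawn$PANEL, as.numeric(drawn$x)), ]

# Medians computed by hand, one per cut and color, in the same order
manual <- df %>%
  group_by(cut, color) %>%
  summarise(median_carat = median(carat), .groups = "drop") %>%
  arrange(cut, color)

# Compare the drawn values with the hand-computed medians
all.equal(as.numeric(drawn$y), manual$median_carat)
```

If the comparison holds, each crossbar is the median for one color within one cut, which is what the question asks for.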


desval

You could do the following: create a separate data frame in which you group by cut and compute the median carat within each group. After that, you can add geom_hline() to your plot with the yintercept aesthetic mapped to that median:

library(tidyverse)

df <- diamonds %>%
  filter(color == "D" | color == "E") %>%
  filter(carat > 1)

df_median <- df %>% group_by(cut) %>%
  summarise(median_carat = median(carat))


p <- ggplot(data=df, aes(x = color, y=carat, fill=color)) +
  geom_jitter(shape = 21, col="black") +
  geom_hline(data = df_median, aes(yintercept = median_carat), size = 2, color = "red")+
  facet_wrap(~ cut, ncol = 5)


mabreitling
  • In this example, I would like a median line for each color at each cut and not a median across both colors with the same cut. – reedms Jul 02 '20 at 18:32
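
Following up on the comment above, here is a minimal sketch of how the summary-data-frame approach might be adapted to give one median per color within each cut. Grouping by both `cut` and `color`, and drawing the segment with `geom_errorbar()` with `ymin` equal to `ymax`, are my assumptions and not part of the original answer. The `width` and `colour` values are arbitrary.

```r
library(tidyverse)

df <- diamonds %>%
  filter(color %in% c("D", "E"), carat > 1)

# One median per cut *and* color, rather than one per cut
df_median <- df %>%
  group_by(cut, color) %>%
  summarise(median_carat = median(carat), .groups = "drop")

p <- ggplot(df, aes(x = color, y = carat, fill = color)) +
  geom_jitter(shape = 21, colour = "black") +
  # ymin == ymax collapses the error bar into a single horizontal segment
  geom_errorbar(
    data = df_median,
    aes(x = color, ymin = median_carat, ymax = median_carat),
    inherit.aes = FALSE, width = 0.6, colour = "red"
  ) +
  facet_wrap(~ cut, ncol = 5)

p
```

Because `df_median` contains the `cut` column, `facet_wrap()` places each segment in the matching panel, and `x = color` places it over the matching group.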