I'm trying to plot a faceted box plot of some data that includes the outliers in colour, a mark showing each group's mean (created with stat_summary), and a horizontal line marking the average value of each facet. I would like a legend for the horizontal line (denoting the facet average) and for the box plot mark (denoting the group means within each facet). I've tried quite a few approaches so far: testing the show.legend argument in every geom, fiddling with different inputs to scale_colour_manual (e.g. this or this SO post), the new_scale_color function of the ggnewscale package, and overriding the aesthetics with the guides function.
Below is the plot so far, along with the code to make it (using the msleep dataset). I've been able to get the desired legend entries for the two stats (geom_hline and stat_summary, marked in green). However, they come with another legend for the variable I'm faceting on (marked in red), and I cannot figure out how to hide only that part of the legend while keeping the entries I'm interested in.
library(tidyverse)
library(ggsci)
# Reshape to long format, then add per-facet and per-group summaries
df <- as_tibble(msleep) %>%
  mutate(bodywt_log10 = log10(bodywt)) %>%
  pivot_longer(cols = c(sleep_total, awake, bodywt_log10),
               names_to = 'stat', values_to = 'value') %>%
  select(-c('sleep_rem', 'sleep_cycle', 'brainwt', 'bodywt')) %>%
  mutate(stat = factor(stat, levels = unique(stat))) %>%
  group_by(stat) %>%
  mutate(mean = mean(value)) %>%  # overall mean of each facet
  group_by(stat, order) %>%
  mutate(Q1 = quantile(value, 1/4, na.rm = TRUE),
         Q2 = quantile(value, 2/4, na.rm = TRUE),
         Q3 = quantile(value, 3/4, na.rm = TRUE),
         IQR = IQR(value, na.rm = TRUE),
         upper.limit = Q3 + 1.5 * IQR,
         lower.limit = Q1 - 1.5 * IQR)
# Values beyond 1.5 * IQR from the quartiles, drawn separately so they can be coloured
outlier_dat <- df %>%
  filter(value > upper.limit | value < lower.limit)
pal <- pal_npg('nrc')(3)
df %>%
  ggplot(aes(x = order, y = value)) +
  geom_boxplot(show.legend = FALSE, outlier.shape = NA, aes(col = stat)) +
  geom_boxplot(outlier.shape = NA, show.legend = FALSE, aes(fill = stat)) +
  stat_summary(fun = mean, geom = 'point', shape = 18, size = 3,
               aes(col = 'group_mean'), show.legend = TRUE) +
  facet_grid(stat ~ ., scales = 'free', labeller = as_labeller(
    c(sleep_total = 'Total sleep', awake = 'Time awake',
      bodywt_log10 = 'log10(Bodyweight)'))) +
  geom_hline(aes(yintercept = mean, col = 'mean'), size = .75, lty = 2,
             show.legend = TRUE) +
  scale_fill_npg() +
  scale_colour_manual(
    name = NULL,
    values = c('mean' = 'black', 'group_mean' = 'gray10',
               'sleep_total' = pal[1], 'awake' = pal[2], 'bodywt_log10' = pal[3]),
    breaks = c('mean', 'group_mean'),
    labels = c('Overall mean', 'Group mean'),
    guide = guide_legend(override.aes = list(linetype = c('dashed', 'blank'),
                                             shape = c(NA, 18)))) +
  geom_point(data = outlier_dat, aes(col = stat), size = 2, show.legend = FALSE)
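
One thing that may be worth checking, as a guess rather than something confirmed against this exact plot: the ggplot2 documentation says show.legend = TRUE always includes a layer in the legends, while the default NA includes it only if the layer maps a relevant aesthetic. If the red-marked legend is the fill legend, the stat_summary and geom_hline layers may be drawn into it only because of show.legend = TRUE. Below is a minimal sketch on a made-up toy data frame (toy, grp, val and overall are hypothetical names, not part of the msleep code) where those two layers are left at the default:

```r
library(ggplot2)

# Hypothetical toy data standing in for the msleep pipeline
toy <- data.frame(
  grp = rep(c("a", "b"), each = 10),
  val = c(1:10, 11:20)
)
overall <- data.frame(m = mean(toy$val))  # overall mean for the dashed line

p <- ggplot(toy, aes(x = grp, y = val)) +
  # fill is mapped, but this layer is kept out of every legend
  geom_boxplot(aes(fill = grp), show.legend = FALSE) +
  # show.legend is left at its default (NA), so these layers should only
  # appear in legends for aesthetics they map themselves (colour)
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3,
               aes(colour = "group_mean")) +
  geom_hline(data = overall, aes(yintercept = m, colour = "mean"),
             linetype = 2) +
  scale_colour_manual(
    name = NULL,
    values = c(mean = "black", group_mean = "gray10"),
    breaks = c("mean", "group_mean"),
    labels = c("Overall mean", "Group mean"),
    guide = guide_legend(override.aes = list(
      linetype = c("dashed", "blank"),
      shape = c(NA, 18)
    ))
  )
```

If the extra legend still shows up in the full plot, adding guides(fill = "none") would be another option to try, since it switches off the fill guide without touching the colour legend.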