I am using the `flights` data set from the nycflights13 package together with ggplot2. I want to rewrite a plot that uses stat_summary() so that it uses geom_ribbon(), geom_line(), and geom_point() instead. Here is the original code:

flights %>%
  select(hour, dep_delay, arr_delay) %>%
  filter(hour > 4) %>%
  pivot_longer(!hour) %>%
  ggplot() +
  stat_summary(aes(hour, value, color = name),
               fun = mean,
               geom = "point",
               size = 3) +
  stat_summary(aes(hour, value, color = name),
               fun = mean,
               geom = "line",
               size = 1.1) +
  stat_summary(aes(hour, value, color = name),
               fun.data = "mean_sdl",
               fun.args = list(mult = 0.2),
               geom = "ribbon",
               alpha = 0.3) +
  theme_bw()

Below is my code:

 df = flights %>%  
  select(hour, dep_delay, arr_delay) %>% filter(hour > 4) %>%
  pivot_longer(!hour) %>% group_by(hour,name) %>%
  summarise(value = mean(value, na.rm = T))
df %>% mutate(low = value - sd(value)*(0.2), high = value + sd(value)*(0.2)) %>% ggplot() +
  geom_point(aes(hour, value, color = name), size = 3) +
  geom_line(aes(hour, value, color = name), size = 1.1) +
  geom_ribbon(aes(x = hour, ymax = high, ymin = low), alpha = 0.3)
  theme_bw()

However, the plot I made does not look like the original one. I know the problem lies in the geom_ribbon() part, but I don't know how to fix it. Could anyone help me? Thank you so much!

2 Answers

library(nycflights13)
library(tidyverse)
## mean_sdl() (used below) requires the Hmisc package to be installed
f <- flights %>% 
   select(hour, dep_delay, arr_delay) %>% 
   filter(hour > 4) %>%
   pivot_longer(!hour)

Replicate the calculation that stat_summary() does internally, applying the mean_sdl function to each hour/name combination:

fs <- (f
  ## partition data
  %>% group_by(hour, name)
  ## convert value to a list-column
  %>% nest()
  ## summarise each entry
  %>% mutate(across(data, map, \(x) mean_sdl(x, mult = 0.2)))
  ## collapse back to a vector
  %>% unnest(cols = c(data))
)
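As a quick check of what mean_sdl() produces for a single group, the sketch below compares it with a by-hand mean ± 0.2 × sd on a made-up vector. The vector `x` and the name `manual` are invented for illustration; the call assumes Hmisc is installed, since mean_sdl() relies on it.

```r
library(ggplot2)  # exports mean_sdl(), which calls Hmisc internally

# Made-up values for illustration only (not taken from flights)
x <- c(2, 4, 4, 6, 9)

# One-row data frame with columns y, ymin, ymax
res <- mean_sdl(x, mult = 0.2)

# The same three quantities computed by hand
manual <- data.frame(
  y    = mean(x),
  ymin = mean(x) - 0.2 * sd(x),
  ymax = mean(x) + 0.2 * sd(x)
)

print(res)
print(manual)
```

The two printed rows should agree, which is why a plain group_by()/summarise() with mean() and sd() can stand in for the nest()/map() approach if you prefer.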

Now create the plot:

ggplot(fs) +
  aes(hour, y = y, ymin = ymin, ymax = ymax, color = name) +
  geom_point(size = 3) +
  geom_line(size = 1.1) +
  geom_ribbon(alpha = 0.3) + 
  theme_bw()

The order of the layers affects the colours of the lines: if geom_ribbon() is drawn last, it covers the lines with one or two layers of "black/alpha=0.3" (depending on whether the lines are overlapped by one or both confidence regions). I would recommend drawing the lines and points after the ribbon, so that their colours stay closer to the originally specified values and are more predictable (but there's no need to do that if you like the way your plot looks).
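A minimal sketch of the ribbon-first ordering follows. The data frame `toy` is a hypothetical stand-in with the same columns as `fs` (its values are invented), so the snippet runs without the flights data.

```r
library(ggplot2)

# Hypothetical summary table with the same columns as `fs`
# (values invented for illustration)
toy <- data.frame(
  hour = rep(5:8, times = 2),
  name = rep(c("arr_delay", "dep_delay"), each = 4),
  y    = c(-5, -3, 0, 2, 1, 2, 4, 6)
)
toy$ymin <- toy$y - 2
toy$ymax <- toy$y + 2

p <- ggplot(toy) +
  aes(hour, y = y, ymin = ymin, ymax = ymax, color = name) +
  geom_ribbon(alpha = 0.3) +      # drawn first, so it sits underneath
  geom_line(linewidth = 1.1) +    # `linewidth` replaces `size` for lines in ggplot2 >= 3.4.0
  geom_point(size = 3) +          # drawn last, so points stay fully visible
  theme_bw()

print(p)
```

Layers are drawn in the order they are added, so the lines and points end up on top of the semi-transparent fill.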

Ben Bolker

You need to add name as a grouping variable. The natural way to do this is to map it to the color aesthetic:

 flights %>%  
  select(hour, dep_delay, arr_delay) %>% 
  filter(hour > 4) %>%
  pivot_longer(!hour) %>% 
  group_by(hour, name) %>%
  summarise(mean = mean(value, na.rm = TRUE), 
            high = mean(value, na.rm = TRUE) + 0.2 * sd(value, na.rm = TRUE),
            low  = mean(value, na.rm = TRUE) - 0.2 * sd(value, na.rm = TRUE)) %>%
  ggplot() +
  geom_point(aes(hour, mean, color = name), size = 3) +
  geom_line(aes(hour, mean, color = name), size = 1.1) +
  geom_ribbon(aes(x = hour, ymax = high, ymin = low, color = name), alpha = 0.3) +
  theme_bw()
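The summary above takes sd() of the raw values within each hour/name group. The sketch below, on made-up data (the tibble `raw` and its values are invented for illustration), shows how that differs from calling sd() after summarise(), as the question's code does: summarise() drops only the last grouping variable by default, so a later sd(value) sees just the two group means within each hour.

```r
library(dplyr)

# Made-up raw values: two hours, two delay types (illustration only)
raw <- tibble(
  hour  = rep(c(5, 5, 6, 6), each = 3),
  name  = rep(c("arr_delay", "dep_delay", "arr_delay", "dep_delay"), each = 3),
  value = c(1, 2, 3,  10, 20, 30,  0, 5, 10,  4, 4, 7)
)

# Spread of the raw values within each hour/name group
per_group <- raw %>%
  group_by(hour, name) %>%
  summarise(mean = mean(value), sd_raw = sd(value), .groups = "drop")

# Summarise first, then take sd(): the result is still grouped by hour,
# so sd() is computed across the two means within each hour
of_means <- raw %>%
  group_by(hour, name) %>%
  summarise(value = mean(value)) %>%
  mutate(sd_of_means = sd(value)) %>%
  ungroup()

print(per_group)
print(of_means)
```

The `sd_raw` and `sd_of_means` columns measure different things, which is one reason a ribbon built from the second approach will not match the stat_summary() version.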


Allan Cameron
  • This doesn't look like the OP's original plot (I think the `sd()` value is not being applied at the right level). My plot is closer but the colours don't match, not sure why ... – Ben Bolker Jun 12 '22 at 18:44
  • No, I assumed the OP was attempting to modify the original by multiplying the sd by 0.2 - perhaps not. – Allan Cameron Jun 12 '22 at 18:46
  • @BenBolker on re-reading it, I think you are right. I think the output is identical now. – Allan Cameron Jun 12 '22 at 19:01
  • the "thank you" is appreciated, but thank-you comments are actually deprecated on Stack Overflow. You can up-vote any answers that were useful, and pick the answer that best answers your question and click on the check-mark to accept it. – Ben Bolker Jun 12 '22 at 19:11