I began to have this question while reading this book "R for Data Science" by Hadley Wickham. In the chapter of "Data Transformation", the author used this example:
library(nycflights13)
library(tidyverse)
by_dest <- group_by(flights, dest)
delay <- summarise(by_dest,
count = n(),
dist = mean(distance, na.rm = TRUE),
delay = mean(arr_delay, na.rm = TRUE)
)
#> `summarise()` ungrouping output (override with `.groups` argument)
delay <- filter(delay, count > 20, dest != "HNL")
# It looks like delays increase with distance up to ~750 miles
# and then decrease. Maybe as flights get longer there's more
# ability to make up delays in the air?
ggplot(data = delay, mapping = aes(x = dist, y = delay)) +
geom_point(aes(size = count), alpha = 1/3) +
geom_smooth(se = FALSE)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
The output plot looks like as follows: relationship between delay and flight distance
So I wonder if we can know the extremum of the curve and the corresponding x value using R?
From this answer, I've figured using stat_poly_eq() from ggpmisc to calculate the polynomial regression equation:
library(ggpmisc)
formula=y ~ poly(x, 3, raw=TRUE)
p <- ggplot(data = delay, mapping = aes(x = dist, y = delay))
p <- p + geom_point(aes(size = count), alpha = 1/3)
p <- p+ geom_smooth(method = "lm", formula = formula, se = FALSE)
(p1 <- p+ stat_poly_eq(formula = formula, aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~")), parse = TRUE))
And use that equation to find out the extremum of the curve, which I think it's very not clever. So I wonder how can I figure out the extremum and the corresponding x value of my regression curve (not the max and min value of raw data but the max and the min value of the regression curve).