We don't have access to the Delays_weather0 dataset, so I'll use the first 1000 observations of dep_delay from the flights dataset, provided in the nycflights13 package.
Since hist in R plots frequencies by default, I'll multiply the probabilities by the number of observations, i.e. 1000, to make the two graphs comparable.
First, using the lines function.
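library(nycflights13)
# first 1000 departure delays
dataset <- flights$dep_delay[1:1000]
hist(x = dataset,
     breaks = 10,
     col = "red",
     xlab = "Delays",
     main = "Flight Delays")
range_dataset <- range(dataset,
                       na.rm = TRUE)
# grid of x values spanning the observed delays
equidistant_points_in_range <- seq(from = range_dataset[1],
                                   to = range_dataset[2],
                                   length.out = length(x = dataset))
# P(X > x) under a normal fitted to the data
upper_cdf_probabilities <- pnorm(q = equidistant_points_in_range,
                                 mean = mean(x = dataset,
                                             na.rm = TRUE),
                                 sd = sd(x = dataset,
                                         na.rm = TRUE),
                                 lower.tail = FALSE)
# pass the grid as x so the curve lines up with the histogram's x-axis;
# with x alone, lines would plot the values against their index 1, 2, ...
lines(x = equidistant_points_in_range,
      y = length(x = dataset) * upper_cdf_probabilities,
      col = "blue")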

Created on 2019-03-17 by the reprex package (v0.2.1)
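To see why multiplying by the number of observations puts the curve on the count scale: n * P(X > x) is the expected number of observations above x under the fitted normal. Here is a minimal sketch of that idea, using simulated data instead of flights. The simulated_delays vector, its parameters and the thresholds are made up for illustration.

```r
# Simulated stand-in for the delay data (hypothetical values, not from flights)
set.seed(42)
simulated_delays <- rnorm(n = 1000, mean = 10, sd = 30)

n_obs <- length(simulated_delays)
fitted_mean <- mean(simulated_delays)
fitted_sd <- sd(simulated_delays)

# Arbitrary thresholds chosen for illustration
thresholds <- c(-20, 0, 20, 40)

# Expected number of observations above each threshold under the fitted normal
expected_counts <- n_obs * pnorm(q = thresholds,
                                 mean = fitted_mean,
                                 sd = fitted_sd,
                                 lower.tail = FALSE)

# Observed number of observations above each threshold
observed_counts <- sapply(X = thresholds,
                          FUN = function(threshold) sum(simulated_delays > threshold))

print(data.frame(thresholds, expected_counts, observed_counts))
```

If the normal fit is reasonable, the two count columns should be close. For strongly skewed data, which real delays may well be, they can differ noticeably.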
Another way, using the curve function:
dataset <- nycflights13::flights$dep_delay[1:1000]
range_dataset <- range(dataset,
                       na.rm = TRUE)
# P(X > x) under a normal fitted to the data
upper_tail_probability <- function(x)
{
  pnorm(q = x,
        mean = mean(x = dataset,
                    na.rm = TRUE),
        sd = sd(x = dataset,
                na.rm = TRUE),
        lower.tail = FALSE)
}
vectorized_upper_tail_probability <- Vectorize(FUN = upper_tail_probability)
# freq = FALSE puts the histogram on the density scale,
# so the probabilities are drawn without scaling by the number of observations
hist(x = dataset,
     freq = FALSE,
     col = "red",
     xlab = "Delays",
     main = "Flight Delays")
# add = TRUE overlays the curve on the existing histogram
curve(expr = vectorized_upper_tail_probability,
      from = range_dataset[1],
      to = range_dataset[2],
      n = 1000,
      add = TRUE,
      col = "blue")

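A note on the Vectorize step: pnorm already accepts a vector for q, so the wrapper should not change the result here, and passing the unwrapped function to curve should work as well. Below is a small check on made-up data. The example_delays vector and the function names are hypothetical, chosen only for this sketch.

```r
# Hypothetical stand-in data containing a missing value
example_delays <- c(-5, 0, 12, NA, 40, 95)

# Upper tail probability under a normal fitted to the non-missing values
upper_tail <- function(x) {
  pnorm(q = x,
        mean = mean(example_delays, na.rm = TRUE),
        sd = sd(example_delays, na.rm = TRUE),
        lower.tail = FALSE)
}
vectorized_upper_tail <- Vectorize(FUN = upper_tail)

# Evaluate both versions on the same grid of points
evaluation_points <- seq(from = -5, to = 95, length.out = 5)
direct_result <- upper_tail(evaluation_points)
wrapped_result <- vectorized_upper_tail(evaluation_points)

# Compare the two results
print(all.equal(direct_result, wrapped_result))
```

Keeping Vectorize is harmless. It only matters for functions that cannot handle a vector argument on their own.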