I am using RStudio and ggplot2. I want to create a series of frequency histograms, each with a normal density curve overlaid. I got this working for a single variable, using the binwidth function suggested in the tidyverse documentation.

This is the code for a single variable:

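library(ggplot2)
library(ggthemes)  # provides theme_base()

ggplot(mtcars, aes(x = mpg)) +
  # Freedman-Diaconis bin width: data range divided by the FD class count
  geom_histogram(binwidth = function(x) (max(x) - min(x)) / nclass.FD(x)) +
  stat_function(
    # Multiply the density by n * bw to put the curve on the count scale
    fun = function(x, mean, sd, n, bw) {
      dnorm(x = x, mean = mean, sd = sd) * n * bw
    },
    args = with(mtcars, c(mean = mean(mpg), sd = sd(mpg), n = length(mpg),
                          bw = (max(mpg) - min(mpg)) / nclass.FD(mpg)))
  ) +
  scale_x_continuous("Miles per gallon") +
  theme_base() +
  ggtitle("Histogram with Normal Curve")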

Now I would like to create the same graph for every variable in my data frame, using the data in long format and facet_wrap(), as in the last example on the same tidyverse page. The problem is clearly with the normal curve drawn by stat_function().

This is the code I am using; the attached figure shows its output:

ggplot(mtlong, aes(value)) + facet_wrap(~variable, scales = 'free_x') +
  geom_histogram(binwidth = function(x) (max(x)-min(x))/nclass.FD(x)) +
  stat_function(
    fun = function(x, mean, sd, n, bw){
      dnorm(x = x, mean = mean, sd = sd) * n * bw
    }, 
    args = with(mtlong, c(mean = mean(value), sd = sd(value), n = length(value),
                          bw = (max(value) - min(value)) / nclass.FD(value)))) +
  theme_base() +
  ggtitle("Histogram with Normal Curve")

[Figure: output of the faceted code above]

Any suggestions on how to adjust the normal curve for each variable?

Many thanks! CP
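One possible direction, offered as a minimal sketch and not a definitive fix: the `args` in the faceted call are computed once from the whole `value` column, so every panel receives the same mean, sd, n and bin width. Precomputing one scaled curve per variable and drawing it with `geom_line()` lets each facet pick up its own curve. The names `vars`, `fd_bw` and `curves` are hypothetical helpers, and `mtlong` is assumed here to be a long-format subset of `mtcars` with columns `variable` and `value`.

```r
library(ggplot2)

# Assumption: mtlong is a long-format copy of a few mtcars columns,
# rebuilt with base R so that the sketch is self-contained.
vars <- c("mpg", "disp", "hp", "wt")
mtlong <- stack(mtcars[vars])              # columns: values, ind
names(mtlong) <- c("value", "variable")

# Same bin width rule as in the question (range / Freedman-Diaconis classes)
fd_bw <- function(x) (max(x) - min(x)) / nclass.FD(x)

# One normal curve per variable, scaled from density to counts (n * bw)
curves <- do.call(rbind, lapply(split(mtlong, mtlong$variable), function(d) {
  x <- seq(min(d$value), max(d$value), length.out = 200)
  data.frame(
    variable = d$variable[1],
    value    = x,
    count    = dnorm(x, mean(d$value), sd(d$value)) * nrow(d) * fd_bw(d$value)
  )
}))

p <- ggplot(mtlong, aes(value)) +
  facet_wrap(~variable, scales = "free_x") +
  geom_histogram(binwidth = fd_bw) +
  geom_line(data = curves, aes(y = count)) +   # x = value is inherited
  ggtitle("Histogram with Normal Curve")

print(p)
```

Because `curves` carries the same `variable` column as `mtlong`, `facet_wrap()` routes each curve to its matching panel. `theme_base()` from ggthemes can be added back to `p` if that package is loaded.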