I am looking for the most elegant way to superimpose normal distribution fits on grouped histograms in ggplot2. I know this question has been asked many times before, but none of the proposed options, like this one or this one, struck me as very elegant, at least not unless stat_function could be made to work on each subset of the data.
One relatively elegant way to superimpose a normal distribution fit onto a non-grouped histogram that I did come across uses geom_smooth with method="nls" (aside from the fact that it is not a self-starting function, so starting values have to be specified):
library(ggplot2)
myhist = data.frame(size = 10:27, counts = c(1L, 3L, 5L, 6L, 9L, 14L, 13L, 23L, 31L, 40L, 42L, 22L, 14L, 7L, 4L, 2L, 2L, 1L) )
ggplot(data=myhist, aes(x=size, y=counts)) + geom_point() +
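  geom_smooth(method = "nls", formula = y ~ N * dnorm(x, m, s), se = FALSE,
              # recent ggplot2 versions pass fitting arguments via method.args
              method.args = list(start = list(m = 20, s = 5, N = 300)))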
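To see what geom_smooth is fitting behind the scenes, the same model can be passed to nls() directly. This is only a minimal sketch using the myhist data above; the object name fit is arbitrary. Because the sizes are spaced one unit apart, N can be read roughly as the total count under the fitted curve.

```r
# Same data as in the plot above
myhist <- data.frame(
  size   = 10:27,
  counts = c(1L, 3L, 5L, 6L, 9L, 14L, 13L, 23L, 31L, 40L, 42L,
             22L, 14L, 7L, 4L, 2L, 2L, 1L)
)

# Scaled normal curve fitted by non-linear least squares;
# nls() is not self-starting here, so starting values are required
fit <- nls(counts ~ N * dnorm(size, m, s),
           data  = myhist,
           start = list(m = 20, s = 5, N = 300))

# Estimated mean (m), standard deviation (s) and scaling factor (N)
coef(fit)
```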
I was wondering though whether this approach could also be used to add normal distribution fits to grouped histograms as in
library(devtools)
install_github("tomwenseleers/easyGgplot2",type="source")
library("easyGgplot2") # load weight data
ggplot(weight, aes(x = weight)) +
  geom_histogram(aes(y = ..count.., colour = sex, fill = sex), alpha = 0.5, position = "identity")
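To make the target concrete, here is a minimal sketch of the manual route, which is more verbose than would be ideal. It assumes a simulated stand-in for the weight data (the values, the bin width of 1 and the names dat, grid and fits are all made up for illustration) and uses moment estimates (mean and sd per group) rather than an nls fit. Each curve is scaled from density to counts by multiplying by the group size and the bin width.

```r
library(ggplot2)

# Simulated stand-in for the weight data (hypothetical values, not the real data set)
set.seed(1)
dat <- data.frame(
  sex    = rep(c("F", "M"), each = 200),
  weight = c(rnorm(200, mean = 55, sd = 5), rnorm(200, mean = 65, sd = 5))
)

binwidth <- 1  # assumed bin width, shared by the histogram and the curves

# One normal curve per group, scaled to counts: n * binwidth * dnorm()
grid <- seq(min(dat$weight), max(dat$weight), length.out = 200)
fits <- do.call(rbind, lapply(split(dat, dat$sex), function(d) {
  data.frame(
    sex    = d$sex[1],
    weight = grid,
    count  = nrow(d) * binwidth * dnorm(grid, mean(d$weight), sd(d$weight))
  )
}))

# Overlaid histograms with the per-group curves in the same panel
p <- ggplot(dat, aes(x = weight, colour = sex, fill = sex)) +
  geom_histogram(binwidth = binwidth, alpha = 0.5, position = "identity") +
  geom_line(data = fits, aes(y = count))
p
```

This gives counts on the y axis and keeps both groups in one panel, but the per-group bookkeeping is exactly what a grouping-aware stat would hide.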
I was also wondering whether there are any packages that define a + stat_distrfit() or + stat_normfit() for ggplot2, with the possibility for grouping. (I couldn't really find anything, but this seems like a common enough task, so I was just wondering.)
The reason I want the code to be as short as possible is that this is for a course, and I want to keep things as simple as possible.
PS: geom_density does not suit my goal, and I would also like to plot the counts/frequencies as opposed to the density. I would also like to have the groups in the same panel and avoid using facet_wrap.