I am using R to create size-frequency histograms for diseased and healthy individuals, with fitted normal distribution lines. I have two questions.
1. How do I create a histogram from aggregated data? The example table below gives the summarized number of diseased and healthy individuals at each size.
dput(data)
structure(list(Size = c(25L, 28L, 31L, 45L, 60L), diseased = c(0L,
22L, 10L, 5L, 2L), healthy = c(55L, 40L, 15L, 7L, 2L)), .Names = c("Size",
"diseased", "healthy"), class = "data.frame", row.names = c(NA,
-5L))
2. How do I overlay both histograms in one figure with fitted normal distribution lines?
For the aggregated data I have tried ggplot(data,aes(x=Size,y=diseased))+geom_bar(stat='identity'). This works well for the diseased individuals, but I can't figure out how to add the histogram for the healthy individuals.
I have also tried the following code to expand the summarized data (called "data") back to the original raw format: raw <- data[rep(1:data, times=data$diseased), "Size", drop=FALSE]
This gives the following error message: Error in rep(1:data, times=data$diseased) : invalid 'times' argument. From previous comments, it appears that the rep function can't handle a count of 0.
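For reference, here is a minimal sketch of the expansion step I am attempting. It rests on an assumption that is not confirmed above: that the row index should be built with seq_len(nrow(data)) rather than 1:data. As far as I can tell, base R's rep() accepts zero counts in times and simply omits those rows. The helper expand_counts and the column name status are hypothetical names introduced only for this illustration.

```r
# Aggregated counts, matching the dput() output above
data <- data.frame(
  Size     = c(25L, 28L, 31L, 45L, 60L),
  diseased = c(0L, 22L, 10L, 5L, 2L),
  healthy  = c(55L, 40L, 15L, 7L, 2L)
)

# Hypothetical helper (not from the original code): repeat each Size once per
# counted individual in the given column. Rows with a count of 0 contribute
# no rows to the result.
expand_counts <- function(df, count_col) {
  idx <- rep(seq_len(nrow(df)), times = df[[count_col]])
  data.frame(Size = df$Size[idx], status = rep(count_col, length(idx)))
}

# One row per individual, with a column marking the group
raw <- rbind(expand_counts(data, "diseased"), expand_counts(data, "healthy"))

# Per-group mean and standard deviation, the two parameters a fitted
# normal curve would need
means <- tapply(raw$Size, raw$status, mean)
sds   <- tapply(raw$Size, raw$status, sd)
```

If this is right, raw could presumably be passed to ggplot with fill = status and geom_histogram(position = "identity", alpha = 0.5) to overlay the two groups, though I have not confirmed that this is the best approach for the fitted lines.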