I've been trying to overlay a KDE curve on my histogram, and I've managed to do so. However, when I try to scale the secondary axis with sec.axis = sec_axis(~./#number), I can't seem to match it to my histogram. Is there a way to make it automatically choose the number that gives the best match?

The code I'm using is:

a <- ggplot(birds, aes(`Log10(Total Average)`)) +
  geom_histogram(col = 'black', fill = 'white', binwidth = 0.2) +
  labs(x = 'Log10 total body mass (kg)', y = 'Frequency', title = 'Average total body mass (kg) of bird species (male and female) in KNP')
a + geom_density(aes(y = ..count..), col = 2, size = 1) +
  scale_y_continuous(sec.axis = sec_axis(~ . / 40, name = "Density"))

markus
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Feb 18 '20 at 21:42

1 Answer

As with your previous post, you did not provide a reproducible example of your dataset, so it is hard to offer a solution that is sure to work for your data.

One way is to compute the maximum count per bin and the maximum value of the density function, take their ratio, and use that ratio both to scale geom_density and (inversely) to define the secondary axis:

library(ggplot2)

set.seed(123)
df <- data.frame(Total_average = rnorm(100, 0, 2))
binwidth <- 0.2

# Bin edges spanning the data range
Seq <- seq(floor(min(df$Total_average)), ceiling(max(df$Total_average)), by = binwidth)

# Maximum count per bin (plot = FALSE avoids drawing a base-graphics histogram)
Max_HIST <- max(hist(df$Total_average, breaks = Seq, plot = FALSE)$counts)

# Maximum of the density
Max_Dens <- max(density(df$Total_average)$y)

# Factor that rescales the density peak to the tallest bar
Ratio <- Max_HIST / Max_Dens

ggplot(df, aes(Total_average)) +
  geom_histogram(col = 'black', fill = 'white', binwidth = binwidth) +
  # after_stat() replaces the older ..density.. notation (ggplot2 >= 3.3.0)
  geom_density(aes(y = after_stat(density) * Ratio)) +
  scale_y_continuous(sec.axis = sec_axis(~ . / Ratio, name = "Density"))
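
If the aim is for the curve to sit on the histogram in the usual statistical sense, rather than only matching peak heights, a fixed factor may be enough: the expected count in a bin is roughly density × n × binwidth. The sketch below is an alternative to the peak-matching ratio, not the same method. It uses the same simulated data; `scale_factor` is a name introduced here for illustration, and `after_stat()` assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(Total_average = rnorm(100, 0, 2))
binwidth <- 0.2

# Expected count in a bin is about density * n * binwidth,
# so this factor converts density units to count units.
scale_factor <- nrow(df) * binwidth

p <- ggplot(df, aes(Total_average)) +
  geom_histogram(col = "black", fill = "white", binwidth = binwidth) +
  geom_density(aes(y = after_stat(density) * scale_factor), col = 2) +
  # The secondary axis undoes the scaling, so it reads in density units
  scale_y_continuous(sec.axis = sec_axis(~ . / scale_factor, name = "Density"))
p
```

With this factor nothing has to be tuned by hand: it changes automatically with the number of rows and the chosen bin width.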

Does that answer your question?

If not, please provide a reproducible example of your dataset by following this tutorial: [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)


NB: For some reason, geom_histogram and the hist function do not seem to count per bin in exactly the same way. For now, I don't have a good explanation for it.
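
A possible explanation, offered tentatively: when only `binwidth` is given, geom_histogram appears to centre its bins on multiples of the bin width, whereas hist treats the supplied breaks as bin edges, so the two can bin the same data differently. One way to sidestep the mismatch is to read the counts back from the histogram layer itself with `layer_data()`. The following is a sketch on the same simulated data; `p_hist`, `max_count`, `max_dens` and `ratio` are names introduced here for illustration.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(Total_average = rnorm(100, 0, 2))
binwidth <- 0.2

# Build the histogram on its own first
p_hist <- ggplot(df, aes(Total_average)) +
  geom_histogram(col = "black", fill = "white", binwidth = binwidth)

# layer_data() returns the data ggplot2 computed for a layer,
# here the per-bin counts of the first (histogram) layer
max_count <- max(layer_data(p_hist, 1)$count)
max_dens <- max(density(df$Total_average)$y)
ratio <- max_count / max_dens

p_hist +
  geom_density(aes(y = after_stat(density) * ratio)) +
  scale_y_continuous(sec.axis = sec_axis(~ . / ratio, name = "Density"))
```

Because the counts come from the plotted bins, the tallest bar and the density peak should line up regardless of how ggplot2 places its bin edges.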

dc37