By default, the histogram is centered at 0, and the first bar's x limits are at 0.5*binwidth and -0.5*binwidth. From there, the bars continue with width = binwidth in both directions until they hit the minimum and maximum. Or, if your data is all > 0, they start at the first edge of the form (x+0.5)*binwidth (for an integer x) whose bar contains data.
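To make that layout concrete, here is a minimal base R sketch of bars centered on multiples of binwidth. The helper name centered_edges is hypothetical (it is not part of ggplot2), and the sketch only covers the bars that actually contain data: it does not try to reproduce any extra empty bars that ggplot2 may pad onto the ends, nor its exact handling of values that fall on an edge.

```r
# Hypothetical helper (not a ggplot2 function): edges of bars that are
# centered on multiples of binwidth and cover the range of x.
centered_edges <- function(x, binwidth) {
  # Index k of the bar (centered at k * binwidth) holding each extreme
  k_min <- floor(min(x) / binwidth + 0.5)
  k_max <- floor(max(x) / binwidth + 0.5)
  # Bar k spans (k - 0.5) * binwidth to (k + 0.5) * binwidth
  (seq(k_min - 1, k_max) + 0.5) * binwidth
}

centered_edges(c(-0.9, 0.2, 1.4), binwidth = 1)
# [1] -1.5 -0.5  0.5  1.5

# All-positive data: the edges start at the first bar that contains data
centered_edges(c(2.2, 3.9), binwidth = 1)
# [1] 1.5 2.5 3.5 4.5
```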
For your example (using a set.seed for reproducibility):
library(ggplot2)
set.seed(1)
x <- rnorm(25)
# bin width: one tenth of the data range
binwidth <- (range(x)[2]-range(x)[1])/10
p <- ggplot(data.frame(x=x), aes(x = x)) +
  geom_histogram(aes(y = ..density..), binwidth = binwidth)
We can get the breaks out by using:
x1 <- ggplot_build(p)$data
giving us our breaks:
x1[[1]]$x
[1] -2.4764874 -2.0954894 -1.7144913 -1.3334932 -0.9524952 -0.5714971 -0.1904990 0.1904990 0.5714971
[10] 0.9524952 1.3334932 1.7144913 2.0954894
So, to get the minimum, we need to round the lowest value of the data down to the nearest (integer + 0.5) multiple of binwidth (NB I'm sure there is a better formula, but this works):
binwidth*(floor((min(x)-binwidth/2)/binwidth)+0.5)
-2.476487
similarly the maximum is:
binwidth*(ceiling((max(x)+binwidth/2)/binwidth)+0.5)
2.095489
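If you need both limits more than once, the two formulas can be wrapped in a small function. This is only a sketch: hist_limits is a hypothetical name, not a ggplot2 function, and it simply repeats the arithmetic above, so it inherits whatever assumptions those formulas make about how the bars are laid out.

```r
# Hypothetical wrapper around the two formulas above (not part of ggplot2)
hist_limits <- function(x, binwidth) {
  lower <- binwidth * (floor((min(x) - binwidth / 2) / binwidth) + 0.5)
  upper <- binwidth * (ceiling((max(x) + binwidth / 2) / binwidth) + 0.5)
  c(lower = lower, upper = upper)
}

# Same data and bin width as in the example
set.seed(1)
x <- rnorm(25)
binwidth <- (range(x)[2] - range(x)[1]) / 10
hist_limits(x, binwidth)
```

With the example data this should return the same two values as the formulas above.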