I have data that tells me how many minutes were required to solve a task:

dat = data.frame(a = c(5.5,7,4,20,4.75,6,5,8.5,10,10.5,13.5,14,11))

I plotted a density histogram of the data with the ggplot2 package:

p = ggplot(dat, aes(x=a)) +
  geom_histogram(aes(y=..density..), breaks = seq(4,20,by=2)) +
  xlab("Required Solving Time")

Now I would like to label every bar with its density value, placed on top of the bar. I tried adding + geom_text(label=..density..), but this returns the error

object '..density..' not found

Does anyone know what input geom_text() needs in my case to produce those labels?

A solution without geom_text() is fine too, but I would prefer to stay within the ggplot2 package.

Alias
  • Is this what you're after? http://stackoverflow.com/questions/24198896/how-to-get-data-labels-for-a-histogram-in-ggplot2/24199013#24199013 – MrFlick May 07 '16 at 17:16
  • yeah I saw this answer when I searched stackoverflow but in my case it's a density histogram and not absolute frequency bars. I wasn't quite able to derive a solution to my problem from that answer... – Alias May 07 '16 at 17:41

3 Answers


You can label the bars using stat_bin with geom="text". stat_bin calculates the counts, which we convert to densities using ..density.., just as for geom_histogram. By setting geom="text", we display those density values as text instead of bars. We also need to pass the same breaks to geom_histogram and stat_bin so that the density values match. I've placed the text labels in the middle of each bar by multiplying ..density.. by 0.5 in the y aesthetic; change that factor to move the labels up or down.

breaks = seq(4,20,by=2)  

ggplot(dat, aes(x=a)) + 
  geom_histogram(aes(y=..density..), breaks = breaks) + 
  stat_bin(geom="text", aes(label=round(..density..,2), y=0.5*..density..), 
           breaks=breaks, colour="white") +
  xlab("Required Solving Time")
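
As a quick check on what the labels mean, here is a minimal sketch of the density arithmetic in base R, without ggplot2. It assumes density = count / (number of observations × bin width), and it relies on base hist() binning this particular data the same way as the plot does (right-closed bins, lowest value included). The names h and manual_density are illustrative only.

```r
# Data and breaks copied from the question
dat <- data.frame(a = c(5.5, 7, 4, 20, 4.75, 6, 5, 8.5, 10, 10.5, 13.5, 14, 11))
breaks <- seq(4, 20, by = 2)

# Bin the data without drawing anything
h <- hist(dat$a, breaks = breaks, plot = FALSE)

# density = count / (number of observations * bin width)
manual_density <- h$counts / (nrow(dat) * diff(breaks))

h$counts                  # 5 1 2 2 2 0 0 1
round(manual_density, 2)  # 0.19 0.04 0.08 0.08 0.08 0.00 0.00 0.04
```

For example, the first bar holds 5 of the 13 observations and is 2 minutes wide, so its density is 5 / (13 × 2) ≈ 0.19, which is the value printed on that bar.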


To get the labels just above the bars, you can use:

ggplot(dat, aes(x=a)) + 
  geom_histogram(aes(y=..density..), breaks = breaks) + 
  stat_bin(geom="text", aes(label=round(..density..,2), y=..density..),
           breaks=breaks, vjust = -1) +
  xlab("Required Solving Time")
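
On recent ggplot2 releases the ..density.. notation is deprecated (since version 3.4.0) in favour of after_stat(), which was introduced in 3.3.0. A sketch of the same plot in that style, assuming ggplot2 3.3.0 or later is installed; p_mod is just an illustrative name:

```r
library(ggplot2)

dat <- data.frame(a = c(5.5, 7, 4, 20, 4.75, 6, 5, 8.5, 10, 10.5, 13.5, 14, 11))
breaks <- seq(4, 20, by = 2)

p_mod <- ggplot(dat, aes(x = a)) +
  # Bars: map the computed density to y
  geom_histogram(aes(y = after_stat(density)), breaks = breaks) +
  # Labels: same binning, drawn as text just above each bar
  stat_bin(geom = "text",
           aes(label = round(after_stat(density), 2), y = after_stat(density)),
           breaks = breaks, vjust = -1) +
  xlab("Required Solving Time")

p_mod
```

The only change from the code above is replacing each ..density.. with after_stat(density); the breaks still have to match in both layers.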


Jaap
eipi10

..density.. is computed by the stat, so you need to tell the geom_text layer to use a binning statistic as well:

p + geom_text(aes(label=round(..density.., 2), y=..density..), 
              stat="bin", breaks = seq(4,20,by=2), 
              col="white", vjust=1)


baptiste

You can do it with ggplot_build():

library(ggplot2)
dat = data.frame(a = c(5.5,7,4,20,4.75,6,5,8.5,10,10.5,13.5,14,11))
p=ggplot(dat, aes(x=a)) + 
   geom_histogram(aes(y=..density..),breaks = seq(4,20,by=2))+xlab("Required Solving Time")

ggplot_build(p)$data
#[[1]]
#          y count  x xmin xmax    density ncount ndensity PANEL group ymin       ymax colour   fill size linetype alpha
#1 0.19230769     5  5    4    6 0.19230769    1.0     26.0     1    -1    0 0.19230769     NA grey35  0.5        1    NA
#2 0.03846154     1  7    6    8 0.03846154    0.2      5.2     1    -1    0 0.03846154     NA grey35  0.5        1    NA
#3 0.07692308     2  9    8   10 0.07692308    0.4     10.4     1    -1    0 0.07692308     NA grey35  0.5        1    NA
#4 0.07692308     2 11   10   12 0.07692308    0.4     10.4     1    -1    0 0.07692308     NA grey35  0.5        1    NA
#5 0.07692308     2 13   12   14 0.07692308    0.4     10.4     1    -1    0 0.07692308     NA grey35  0.5        1    NA
#6 0.00000000     0 15   14   16 0.00000000    0.0      0.0     1    -1    0 0.00000000     NA grey35  0.5        1    NA
#7 0.00000000     0 17   16   18 0.00000000    0.0      0.0     1    -1    0 0.00000000     NA grey35  0.5        1    NA
#8 0.03846154     1 19   18   20 0.03846154    0.2      5.2     1    -1    0 0.03846154     NA grey35  0.5        1    NA


# ggplot_build(p)$data is a list with one data frame per layer; take the first
p + geom_text(data = ggplot_build(p)$data[[1]], 
              aes(x=x, y= density , label = round(density,2)), 
              nudge_y = 0.005)
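
A possible shorthand, offered as a sketch rather than a replacement: ggplot2 also exports layer_data(), which returns the computed data frame for a single layer, so the histogram values can be pulled out without indexing into ggplot_build() by hand. The name bars is illustrative only.

```r
library(ggplot2)

dat <- data.frame(a = c(5.5, 7, 4, 20, 4.75, 6, 5, 8.5, 10, 10.5, 13.5, 14, 11))
p <- ggplot(dat, aes(x = a)) +
  geom_histogram(aes(y = ..density..), breaks = seq(4, 20, by = 2)) +
  xlab("Required Solving Time")

# Computed data for layer 1 (the histogram): one row per bin
bars <- layer_data(p, 1)

# Label each bar with its rounded density, nudged slightly above the bar top
p + geom_text(data = bars,
              aes(x = x, y = density, label = round(density, 2)),
              nudge_y = 0.005)
```

On newer ggplot2 versions the ..density.. call prints a deprecation warning; it is kept here only to match the code in this answer.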
Jaap
Cabana
  • In comments on [this question,](http://stackoverflow.com/questions/20622332/documentation-on-internal-variables-in-ggplot-esp-panel) Hadley "strongly advise[s] against" using internal variables like PANEL, which I see as one of the columns in the ggplot_build() output. Are the other variables in the ggplot_build() output here, like density, considered safer to use? – Max Power Dec 08 '16 at 16:28
  • Or maybe ggplot_build(p)$data$PANEL is not the "internal" PANEL and is safe to use? [The docs](https://www.rdocumentation.org/packages/ggplot2/versions/2.1.0/topics/print.ggplot?) seem to suggest that ggplot_build() should be as reliable as anything, since it's returned (invisibly) by print.ggplot. And Hadley's warning that I link to above is from 2013... – Max Power Dec 08 '16 at 16:37