I have a faceted plot of probability distributions (histogram of y=..density.. with binwidth=1) as produced below.

library(ggplot2)
library(sqldf)
data(iris)

# Create binned versions of the continuous variables
iris$Sepal.Length <- round(iris$Sepal.Length)
iris$Sepal.Width <- cut(iris$Sepal.Width, breaks = 4)

# Plot probability distributions of Sepal.Length, one panel per Sepal.Width bin
p <- ggplot(iris, aes(x = Sepal.Length, y = ..density..)) +
  geom_histogram(binwidth = 1, alpha = 0.5, position = 'identity', aes(fill = Species)) +
  facet_wrap(~Sepal.Width)

p

I would like to add a geom_text() or geom_label() for a summary stat to each facet (e.g. max(density)).

Based on the answer here, I tried the code below, but each facet showed all of the labels printed on top of one another.

# Create a data frame for the labels out of the ggplot_build() data
ggbuild <- as.data.frame(ggplot_build(p)$data)
label.data <- sqldf('select PANEL, max(density) as max_density from ggbuild group by PANEL')

# Print the faceted plot with a simple summary stat in each facet
p + geom_text(data = label.data, aes(x = .5, y = .8, label = paste0('Max Height: ', round(max_density, 2))))


Based on this answer that "Sometimes it's easier to obtain summaries outside the call to ggplot," I tried the code below:

p + geom_text(aes(x=0, y=.8, label=label.data$max_density)) 

But that got me the following error:

Error: Aesthetics must be either length 1 or the same as the data (100): x, y, label

Max Power
  • Possibly a duplicate of [this](http://stackoverflow.com/questions/11889625/annotating-text-on-individual-facet-in-ggplot2). In your first attempt, your `label.data` data frame does not contain the faceting variable. Try adding this: `label.data$Sepal.Width = sort(unique(iris$Sepal.Width))` – Sandy Muspratt Dec 13 '16 at 00:47
  • Hey thanks a lot Sandy. That does add the geom_text labels I wanted. I feel a bit uncomfortable bolting facet groupings back onto the ggplot_build()$data (label.data), since it's not obvious to me the mapping of PANEL to the ordered Sepal.Width would always be correct (though it clearly is here). I think that reinforces that turning to ggplot_build$data in the first place was a hack, and not the best way to plot a faceted plot with labeled stats. – Max Power Dec 13 '16 at 06:11
  • Possibly not. I was showing that the `label.data` data frame needed the faceting variable. – Sandy Muspratt Dec 13 '16 at 06:34
  • There doesn't seem to be too much wrong with your approach. See [here](http://stackoverflow.com/questions/14584093/ggplot2-find-number-of-counts-in-histogram-maximum) and [here](http://stackoverflow.com/questions/25378184/need-to-extract-data-from-the-ggplot-geom-histogram). But you should modify the sqldf command to get all the required data into the label.data data frame; or use dplyr, or something similar. I don't know how else you would get the maximums. – Sandy Muspratt Dec 13 '16 at 07:01
  • Thanks Sandy, but the sqldf statement is summarizing data in ggplot_build(p)$data (assigned to label.data), which doesn't have info on the faceting variable (Sepal.Width). That's why I said in my comment I was thinking it would be better to summarize the iris data itself by Sepal.Width, instead of taking the ggplot_build() summary, and then adding Sepal.Width information back in. Does that make sense? Or am I still missing something? – Max Power Dec 13 '16 at 16:51
  • Sorry, my mistake. Not thinking. `ggplot_build(p)$layout$panel_layout` matches 'PANEL' number to 'Sepal.Width'. OR, as you say, begin with a summary table. – Sandy Muspratt Dec 13 '16 at 23:33
  • Thanks a lot Sandy, that's exactly what I was hoping for. – Max Power Dec 13 '16 at 23:47
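
For anyone landing here later, the lookup described in the last few comments can be sketched roughly as follows. This is only a minimal sketch under some assumptions, not a definitive solution: it assumes a recent ggplot2, where the panel lookup table appears to live at `ggplot_build(p)$layout$layout` (the comment above used `$layout$panel_layout`, which matches the release current at the time), and it uses `after_stat(density)` in place of `..density..`. It also swaps `sqldf` for base R `aggregate()` and `merge()`, and places the label with `x = -Inf, y = Inf`; the names `built`, `layout_df` and `p_labeled` are just illustrative.

```r
library(ggplot2)
data(iris)

# Same binning as in the question
iris$Sepal.Length <- round(iris$Sepal.Length)
iris$Sepal.Width <- cut(iris$Sepal.Width, breaks = 4)

p <- ggplot(iris, aes(x = Sepal.Length, y = after_stat(density))) +
  geom_histogram(binwidth = 1, alpha = 0.5, position = "identity", aes(fill = Species)) +
  facet_wrap(~Sepal.Width)

built <- ggplot_build(p)

# Computed histogram data for the first (and only) layer
hist_data <- built$data[[1]]

# PANEL -> Sepal.Width lookup table.
# Assumption: recent ggplot2 stores it in layout$layout;
# older releases (as in the comment above) used layout$panel_layout.
layout_df <- built$layout$layout

# Tallest bar per panel
label.data <- aggregate(density ~ PANEL, data = hist_data, FUN = max)
names(label.data)[2] <- "max_density"

# Attach the faceting variable by joining on PANEL instead of relying on sort order
label.data <- merge(label.data, layout_df[, c("PANEL", "Sepal.Width")], by = "PANEL")

# One label per facet, pinned to the top-left corner of each panel
p_labeled <- p +
  geom_text(
    data = label.data,
    aes(x = -Inf, y = Inf, label = paste0("Max Height: ", round(max_density, 2))),
    hjust = -0.1, vjust = 1.5, inherit.aes = FALSE
  )
```

Printing `p_labeled` should draw a single label in each facet, because `label.data` now carries `Sepal.Width` and `facet_wrap()` can route each row to its own panel. Joining on `PANEL` avoids having to trust that `sort(unique(iris$Sepal.Width))` lines up with the panel order.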

0 Answers