
Folks,

I am plotting histograms using geom_histogram and I would like to label each histogram with its mean value (I am using the mean for the sake of this example). The issue is that I am drawing multiple histograms in one facet, so the labels overlap. This is an example:

library(ggplot2)
df <- data.frame (type=rep(1:2, each=1000), subtype=rep(c("a","b"), each=500), value=rnorm(4000, 0,1))
plt <- ggplot(df, aes(x=value, fill=subtype)) + geom_histogram(position="identity", alpha=0.4)
plt <- plt +  facet_grid(. ~ type)
plt + geom_text(aes(label = paste("mean=", mean(value)), colour=subtype, x=-Inf, y=Inf), data = df, size = 4, hjust=-0.1, vjust=2)

Result is:

[Image: this is the result I get]

The problem is that the labels for subtypes a and b overlap. I would like to solve this.

I have tried the position argument, with both "dodge" and "stack", for example:

plt + geom_text(aes(label = paste("mean=", mean(value)), colour=subtype, x=-Inf, y=Inf), position="stack", data = df, size = 4, hjust=-0.1, vjust=2)

This did not help. In fact, it issued a warning about the width.

Would you please help? Thanks, Riad.

Riad

2 Answers


I think you could precalculate the mean values in a new data frame before plotting.

library(plyr)
df.text<-ddply(df,.(type,subtype),summarise,mean.value=mean(value))

df.text
  type subtype   mean.value
1    1       a -0.003138127
2    1       b  0.023252169
3    2       a  0.030831337
4    2       b -0.059001888

Then use this new data frame in geom_text(). To ensure that the labels do not overlap, you can pass two values to vjust= (one for each of the two labels in a facet).

ggplot(df, aes(x=value, fill=subtype)) + 
  geom_histogram(position="identity", alpha=0.4)+
  facet_grid(. ~ type)+
  geom_text(data=df.text,aes(label=paste("mean=",mean.value),
                 colour=subtype,x=-Inf,y=Inf), size = 4, hjust=-0.1, vjust=c(2,4))
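A note on vjust=c(2,4): it depends on the rows of df.text lining up with the labels in each facet, and newer ggplot2 releases may reject a parameter vector whose length differs from the number of rows. A hedged alternative is to store the offset as a column and map it inside aes(). The sketch below is only an illustration under assumptions: the seed, the vjust column name, bins = 30 and the rounding to 3 digits are my own choices, and df.text is built with base aggregate() so the block runs without plyr.

```r
# Recreate the example data (the seed is an assumption, added for reproducibility)
set.seed(1)
df <- data.frame(type = rep(1:2, each = 1000),
                 subtype = rep(c("a", "b"), each = 500),
                 value = rnorm(4000, 0, 1))

# One mean per type/subtype; rename the result column to match the answer above
df.text <- aggregate(value ~ type + subtype, data = df, FUN = mean)
names(df.text)[names(df.text) == "value"] <- "mean.value"

# Number the labels within each facet (1, 2, ...) and scale to a vertical offset
# ("vjust" is a hypothetical column name chosen for this sketch)
df.text$vjust <- 2 * ave(seq_len(nrow(df.text)), df.text$type, FUN = seq_along)

# Build the plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  plt <- ggplot(df, aes(x = value, fill = subtype)) +
    geom_histogram(position = "identity", alpha = 0.4, bins = 30) +
    facet_grid(. ~ type) +
    geom_text(data = df.text,
              aes(label = paste("mean =", round(mean.value, 3)),
                  colour = subtype, x = -Inf, y = Inf, vjust = vjust),
              size = 4, hjust = -0.1)
}
```

Calling print(plt) draws the figure. Because the offset lives in the data, each label keeps its own position even if the rows of df.text are reordered.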


Didzis Elferts
  • Thank you so much for your prompt answer. That's exactly what I am looking for! Riad. – Riad Dec 03 '13 at 22:39

Just to expand on @Didzis:

You actually have two problems here. First, the text overlaps, but more importantly, when you use aggregating functions in aes(...), as in:

geom_text(aes(label = paste("mean=", mean(value)), ...

ggplot does not respect the subsetting implied by the facets (or by the groups, for that matter). So mean(value) is based on the full dataset regardless of faceting or grouping. As a result, you have to use an auxiliary table, as @Didzis shows.
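To see the difference in numbers, here is a small base R sketch (no ggplot2 needed). It assumes the question's data frame, recreated with a seed that I added for reproducibility; the names overall and by_group are just illustrative.

```r
# Recreate the question's data (the seed is an assumption)
set.seed(1)
df <- data.frame(type = rep(1:2, each = 1000),
                 subtype = rep(c("a", "b"), each = 500),
                 value = rnorm(4000, 0, 1))

# A single number: the mean over all 4000 rows, which is what the
# label shows in every facet when mean(value) is written inside aes()
overall <- mean(df$value)

# Four numbers: one mean per type/subtype combination,
# which is what the labels are meant to show
by_group <- tapply(df$value, list(type = df$type, subtype = df$subtype), mean)

print(overall)
print(by_group)
```

The first value is the same everywhere, while the 2 x 2 table holds the per-facet, per-subtype means that belong in the auxiliary table.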

BTW:

df.text <- aggregate(df$value,by=list(type=df$type,subtype=df$subtype),mean)

gets you the means (in a column named x) and does not require plyr.

jlhoward
  • Great explanation, I now understand why I was getting the same number in all the facets. BTW, I went for the plyr-based solution as I needed to calculate the quantiles: `df.text<-ddply(df,.(type,subtype),summarise,mean=mean(value),sd=sd(value),Q0027=quantile(value,0.0027,names=F))`. Thanks again for your comments, very much appreciated. – Riad Dec 03 '13 at 22:46
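For completeness, several summary statistics per group can also be computed without plyr. The following is a rough base R sketch of the same idea, not a drop-in replacement for the ddply call: the seed and the name df.stats are assumptions, and the resulting columns are prefixed with "value." rather than named exactly as in the comment above.

```r
# Recreate the question's data (the seed is an assumption)
set.seed(1)
df <- data.frame(type = rep(1:2, each = 1000),
                 subtype = rep(c("a", "b"), each = 500),
                 value = rnorm(4000, 0, 1))

# aggregate() can return several statistics at once, but it stores them
# as a single matrix column; do.call(data.frame, ...) flattens that
# matrix into ordinary columns
agg <- aggregate(value ~ type + subtype, data = df,
                 FUN = function(v) c(mean = mean(v),
                                     sd = sd(v),
                                     Q0027 = quantile(v, 0.0027, names = FALSE)))
df.stats <- do.call(data.frame, agg)

print(names(df.stats))
print(df.stats)
```

The flattened columns (value.mean, value.sd, value.Q0027) can then be used in geom_text() in the same way as mean.value.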