I have problems filling the bars when grouping with facet_wrap. I am using this data.frame:

library(ggplot2)
library(gridExtra)
set.seed(1234)
testDat <- data.frame(answer=factor(sample(c("yes", "no"), 60, replace=TRUE)),
                      which=factor(sample(c("q1", "q2", "q3"), 60, replace=TRUE)))

I wanted to plot the answer grouped by the variable which. This gives me the absolute values:

ggplot(testDat, aes(x=answer)) + 
  geom_bar(aes(fill=answer)) + facet_wrap(~which)

This gives me the relative values, but not per group:

ggplot(testDat, aes(x=answer)) + 
  geom_bar(aes(y=(..count..)/sum(..count..), fill=answer)) + facet_wrap(~which)

Searching for an answer, I found this way to plot the relative values per group, but the fill color no longer works:

ggplot(testDat, aes(x=answer)) + 
  geom_bar(aes(y=(..count..)/sum(..count..), group=which, fill=answer)) + facet_wrap(~which)

The fill only works for the three values of 'which', not for the values of 'answer':

ggplot(testDat, aes(x=answer)) + 
  geom_bar(aes(y=(..count..)/sum(..count..), group=which, fill=which)) + facet_wrap(~which)

Any suggestions for how to fill the bars?

p1<-ggplot(testDat, aes(x=answer)) + geom_bar(aes(y=(..count..)/sum(..count..), group=which, fill=answer)) + facet_wrap(~which)
p2<-ggplot(testDat, aes(x=answer)) + geom_bar(aes(y=(..count..)/sum(..count..), group=which, fill=which)) + facet_wrap(~which)
grid.arrange(p1,p2)
schlusie

2 Answers

Is this what you had in mind?

library(reshape2)
library(ggplot2)
df <- aggregate(answer~which,testDat,
                function(x)c(yes=sum(x=="yes")/length(x),no=sum(x=="no")/length(x)))
df <- data.frame(which=df$which, df$answer)
gg <- melt(df,id=1, variable.name="Answer",value.name="Rel.Pct.")
ggplot(gg) + 
  geom_bar(aes(x=Answer, y=Rel.Pct., fill=Answer),position="dodge",stat="identity")+
  facet_wrap(~which)

Unfortunately, aggregating functions such as sum(...), min(...), max(...), and range(...), when used in aesthetic mappings, do not respect the grouping implied by facets. So while ..count.. is subset properly when used alone (as in your numerator), sum(..count..) gives the total for the whole dataset. This is why (..count..)/sum(..count..) gives the fraction of the total, not the fraction of the group.

The only way around that, as far as I am aware, is to create an auxiliary table as above.
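As a rough illustration of the difference between the two fractions, here is a minimal base R sketch. It rebuilds the question's `testDat` and uses `table()` with `prop.table()` instead of `aggregate()` and `melt()`; the names `counts`, `overall`, `per_group` and `aux` are made up for this example and do not come from the answer.

```r
# Rebuild the question's example data (same seed and calls as in the question).
set.seed(1234)
testDat <- data.frame(answer = factor(sample(c("yes", "no"), 60, replace = TRUE)),
                      which  = factor(sample(c("q1", "q2", "q3"), 60, replace = TRUE)))

# Cross-tabulate: rows are levels of `which`, columns are levels of `answer`.
counts <- table(testDat$which, testDat$answer)

# Fraction of the whole dataset: all six cells together sum to 1.
# This is what (..count..)/sum(..count..) ends up plotting.
overall <- prop.table(counts)

# Fraction within each level of `which`: every row sums to 1.
per_group <- prop.table(counts, margin = 1)

# Long format, one row per (which, Answer) pair, usable with
# geom_bar(stat = "identity") and facet_wrap(~which).
aux <- as.data.frame(per_group, responseName = "Rel.Pct.")
names(aux)[1:2] <- c("which", "Answer")
```

Passing `aux` to the `ggplot(...)` call above in place of `gg` should give the same kind of plot, assuming no level of `which` is empty.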

jlhoward
  • Thanks for the explanation regarding the use of aggregating functions. I found a way around `..count../sum(..count..)` with `..density..`. My question still remains why `fill=which` works and `fill=answer` doesn't. – schlusie Jan 21 '14 at 09:12
  • Aggregating functions subset correctly for groups defined in aesthetic mappings (e.g., in the call to `aes(...)`). They do not work for groups defined implicitly in facets. So if you have `aes(..., fill=which)` and also `facet_wrap(~which)`, you are defining groups in both places and it will work. But that is a completely different plot. – jlhoward Jan 21 '14 at 15:22
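The point about groups defined inside `aes(...)` can be sketched with the `prop` variable that `geom_bar()` computes per group. This is not from the thread: it assumes ggplot2 3.3.0 or later (for `after_stat()`), and the fill mapping on `after_stat(factor(x))` is a workaround chosen for this example, because `fill=answer` would conflict with `group=which`.

```r
library(ggplot2)

# Rebuild the question's example data (same seed and calls as in the question).
set.seed(1234)
testDat <- data.frame(answer = factor(sample(c("yes", "no"), 60, replace = TRUE)),
                      which  = factor(sample(c("q1", "q2", "q3"), 60, replace = TRUE)))

# group = which gives one group per facet, so `prop` is count / sum(count)
# within each panel. The fill is taken from the computed x position
# (1 = "no", 2 = "yes", following the factor levels of `answer`).
p <- ggplot(testDat, aes(x = answer)) +
  geom_bar(aes(y = after_stat(prop),
               fill = after_stat(factor(x)),
               group = which)) +
  scale_fill_discrete(name = "answer", labels = levels(testDat$answer)) +
  facet_wrap(~which)

# Built layer data, useful for checking that bar heights sum to 1 per panel.
built <- layer_data(p)
```

Printing `p` draws the plot. The legend labels are set by hand, so they need adjusting if the levels of `answer` change.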

There is a way to aggregate within ggplot, as requested, which is mentioned in this question. However, it requires the PANEL variable, which isn't documented, so Hadley recommended not using it.

Here is a way to aggregate using data.table. I've also added percentage labels to the plot.

# For a factor x, return its levels, the share of each level,
# and a rounded percentage label for each share.
grp <- function(x) {
  percentage <- as.numeric(table(x) / length(x))
  list(x = levels(x),
       percentage = percentage,
       label = paste0(round(percentage * 100), "%")
  )
}

library(data.table)
library(scales)  # provides percent_format(), used for the y axis below
DT <- data.table(testDat)

# Simpler version
ggplot(DT[, grp(answer), by=which]) +
  geom_bar(aes(x=x, y=percentage, fill = x), position="dodge",stat="identity") +
  facet_grid(~which) + 
  xlab("Answer")

# With percentage labels and y axis with percentage
ggplot(DT[, grp(answer), by=which]) +
  geom_bar(aes(x=x, y=percentage, fill = x), position="dodge",stat="identity") +
  geom_text(aes(x=x, ymax = 0.6, y=percentage, label = label), vjust = -1.2, color = "grey20") +
  facet_grid(~which) + 
  xlab("Answer") + xlim("yes", "no") +
  scale_y_continuous(labels = percent_format()) +
  scale_fill_discrete(name = "Answer")
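To see what `grp()` hands back for each group, it can be run on a small factor. The snippet below restates the helper so it runs on its own; the input `toy` is a hypothetical example, not data from the question.

```r
# Same logic as grp() above, restated so this snippet is self-contained.
grp <- function(x) {
  percentage <- as.numeric(table(x) / length(x))
  list(x = levels(x),
       percentage = percentage,
       label = paste0(round(percentage * 100), "%"))
}

# Hypothetical input: three "yes" and one "no".
toy <- factor(c("yes", "no", "yes", "yes"))
res <- grp(toy)

res$x           # "no"  "yes"  (levels come out in alphabetical order)
res$percentage  # 0.25 0.75
res$label       # "25%" "75%"
```

Since the three list elements have equal length, `DT[, grp(answer), by=which]` can expand them into one row per answer level within each value of `which`.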

marbel