I'm a novice with ggplot2 and have a question about generating a stacked bar plot. I checked the book and the dedicated webpage, but can't solve the problem. I have two factors, one of which has 2 levels (presence-absence), the other 10 levels. Let's call these two "variable" and "fruit".

I'd like to create a stacked bar plot where each bar reflects a type of fruit and the number of presence-absence observations in "variable" are stacked on top of each other. This is relatively easy (see code for plot 1 below), but I would also like the bars and y axis to express the number of counts of presence-absence in "variable" as a percentage. In other words, all the bars should be the same height (reflecting a total of 100%) and the counts of presence-absence observations should be converted into percentages.

I can change the y axis scale to a percentage using ..count..*100/sum(..count..) but I can't fathom how to convert the actual bars. I created another plot with faceting (code for plot 2 below) that achieves what I want in terms of percentages, but I would prefer the two bars stacked on top of each other. Does anyone have an idea of how to achieve this? I've provided dummy data and a reproducible example. Thanks for any help.

Steve

library(ggplot2)

# dummy data: one row per observation
dat <- data.frame(
    fruit = c("Apple", "Apple", "Orange", "Orange", "Orange", "Orange",
              "Orange", "Pear", "Pear", "Pear"),
    variable = c("Present", "Absent", "Present", "Present", "Present",
                 "Present", "Absent", "Absent", "Absent", "Present")
)

# stacked bar plot  
ggplot(dat, aes(x = fruit, fill = variable) ) +  
    geom_bar( aes(y = ..count..*100/sum(..count..) ) ) +
    xlab("Fruit") +
    ylab("Would like this to be percentage") + 
    scale_fill_manual("Condition", values = alpha( c("firebrick", "dodgerblue4"), 1) )  

# with faceting  
ggplot(dat, aes(x = variable, fill = variable) ) +   
    geom_bar( aes(y = ..count..*100/sum(..count..) ) ) +   
    facet_grid(. ~ fruit) +  
    xlab("Fruit") +
    ylab("Would like this to be percentage") + 
    scale_fill_manual("Condition", values = alpha( c("firebrick", "dodgerblue4"), 1) )  

PatrickT
Steve

1 Answer

For the first graph, just add position = 'fill' to your geom_bar line. You don't need to scale the counts yourself, because ggplot2 does it automatically: each bar is rescaled so that its stacked segments sum to 1.

ggplot(dat, aes(x = fruit)) + geom_bar(aes(fill = variable), position = 'fill')
Ramnath
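As a rough check of what position = 'fill' displays, the within-fruit proportions can be computed by hand in base R. This is only a sketch using the dummy data from the question; the names counts and props are introduced here for illustration and are not part of the original answer.

```r
# Dummy data from the question.
dat <- data.frame(
    fruit = c("Apple", "Apple", "Orange", "Orange", "Orange", "Orange",
              "Orange", "Pear", "Pear", "Pear"),
    variable = c("Present", "Absent", "Present", "Present", "Present",
                 "Present", "Absent", "Absent", "Absent", "Present")
)

# Cross-tabulate counts: rows are fruits, columns are Absent/Present.
counts <- table(dat$fruit, dat$variable)

# margin = 1 divides each row by its own total, so every fruit sums to 1.
# These are the segment heights that position = 'fill' draws.
props <- prop.table(counts, margin = 1)

# Express as percentages for reading off the plot.
print(round(100 * props, 1))
```

Each row of props sums to 1, which is why all bars in the filled plot have the same height.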
  • Thanks Ramnath, that's exactly what I need for the bars. When I do that, however, the y axis labels revert to a 0 to 1 scale. I'd like them to be 0 to 100. Including y = ..counts..*100 or y = ..density..*100 in "aes" doesn't seem to work. Any ideas? – Steve Sep 01 '10 at 15:54
  • `+ scale_y_continuous("",formatter="percent")`. The initial `""` gets rid of the "count" label, but you could include any label you want. – James Sep 01 '10 at 16:05
  • The plot now looks great, but in my real dataset there are NAs, and ggplot interprets these as another factor level by default. Is there any way to turn this off (or remove NAs) within the plot function, so that the plot ignores NAs and just plots the other two levels out of 100%? Thanks. – Steve Sep 01 '10 at 20:22
  • Steve, if you want to remove all NAs, you can use na.omit(data) in the ggplot call. That passes a data frame with all NAs removed. – Ramnath Sep 02 '10 at 16:27
  • @Ramnath, thanks. This works. However, how can I make ggplot calculate the percentage for each dataset separately? I want to draw something like stacked histograms, but the counts are disparate, so I want to plot histograms of percentages. Can that work? – Sam Mar 02 '11 at 18:54
  • See here for the change in syntax regarding the percent formatter. The above comment no longer works: http://stackoverflow.com/questions/10146109/formatter-argument-in-scale-continuous-throwing-errors-in-r-2-15 – atomicules May 17 '12 at 10:28
  • @Ramnath, would you please explain how to use `facet_grid` in this situation? Simply adding `+ facet_grid(. ~ fruit)` to the plot doesn't work as expected. What is the alternative to `ggplot(dat, aes(x = fruit)) + geom_bar(aes(fill = variable), position = 'fill') + facet_grid(. ~ fruit)`? – PatrickT May 17 '16 at 09:39
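
Pulling the comments on axis labels and NAs together, here is a minimal sketch rather than a definitive answer. It assumes the ggplot2 and scales packages are installed and uses labels = scales::percent in place of the older formatter argument. The data frame dat_na is hypothetical: it is the question's dummy data with one missing value appended to show the effect of na.omit().

```r
library(ggplot2)  # assumption: ggplot2 and scales are installed

# Hypothetical variant of the question's data with one NA added.
dat_na <- data.frame(
    fruit = c("Apple", "Apple", "Orange", "Orange", "Orange", "Orange",
              "Orange", "Pear", "Pear", "Pear", "Pear"),
    variable = c("Present", "Absent", "Present", "Present", "Present",
                 "Present", "Absent", "Absent", "Absent", "Present", NA)
)

# Drop rows containing NA so they are not drawn as a third fill level.
clean <- na.omit(dat_na)

p <- ggplot(clean, aes(x = fruit, fill = variable)) +
    geom_bar(position = "fill") +                                # bars rescaled to 0-1
    scale_y_continuous("Percentage", labels = scales::percent) + # axis shown as 0%-100%
    xlab("Fruit") +
    scale_fill_manual("Condition", values = c("firebrick", "dodgerblue4"))

print(p)
```

Note that na.omit() drops any row with an NA in any column, so on a wider data frame it may be safer to subset to the plotted columns first.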