62

I am trying to create a barplot using ggplot2 where I am stacking by one variable and dodging by another.

Here is an example data set:

df=data.frame(
  year=rep(c("2010","2011"),each=4),
  treatment=rep(c("Impact","Control")),
  type=rep(c("Phylum1","Phylum2"),each=2),
  total=sample(1:100,8))

I would like to create a barplot where x=treatment, y=total, the stacked variable is type and the dodged variable is year. Of course I can do one or the other:

ggplot(df,aes(y=total,x=treatment,fill=type))+geom_bar(position="dodge",stat="identity")

ggplot(df,aes(y=total,x=treatment,fill=year))+geom_bar(position="dodge",stat="identity")

But not both! Thanks to anyone who can provide advice.

jslefche
  • 3
    You can only do one or the other, not both. See my related answer here: http://stackoverflow.com/questions/12592041/plotting-a-stacked-bar-plot/12592235#12592235 – Maiasaura Oct 03 '12 at 19:43

5 Answers

33

Here's an alternative take using faceting instead of dodging:

ggplot(df, aes(x = year, y = total, fill = type)) +
    geom_bar(position = "stack", stat = "identity") +
    facet_wrap( ~ treatment)

With Tyler's suggested change: + theme(panel.margin = grid::unit(-1.25, "lines")) (in ggplot2 2.2.0 and later this setting is named panel.spacing)

M--
Matt Parker
  • 1
    Good alternative to wanting both. +1 – Maiasaura Oct 03 '12 at 19:50
  • Hmm, interesting idea. I guess it will have to do! Thanks to both @Maiasaura and Matt Parker – jslefche Oct 03 '12 at 21:02
  • 13
    Adding `+ theme(panel.margin = unit(-1.25, "lines"))` can make them look more like they're in the same visual field, but it's still not exactly what the OP was after. Nice best alternative. +1 – Tyler Rinker Oct 03 '12 at 22:53
  • 1
    @TylerRinker Nice - didn't even know such a thing was possible! Just a tip for anyone who follows: I had to use `grid::unit` to get this without loading `grid` directly. – Matt Parker Oct 03 '12 at 23:54
  • This does not answer the question. How could it be the accepted answer? – Julien Jul 27 '22 at 14:49
  • 1
    @Julien Most of the answers on this question were posted more than five years after the question was asked, and the question-asker hasn't been active for three years. It's a limitation of StackOverflow. If you prefer a different answer, you should comment on that one and say something like, "This is the best answer for the original question." – Matt Parker Aug 14 '22 at 23:49
  • How can I say "This is the best answer for the original question." if it does not answer the original question? – Julien Aug 15 '22 at 07:00
  • How can I horizontally shift the position of the bars so each facet will have adjacent bars? – arielhasidim Apr 18 '23 at 09:52
  • @arielhasidim Do you mean having the 2010 bars next to each other, then the 2011 bars? In that case, I think you'd just swap `year` and `treatment` in the code above. – Matt Parker Apr 28 '23 at 16:37
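
The swap suggested in the last comment could look roughly like the sketch below. It facets by year instead of treatment, so the Impact and Control bars of one year sit next to each other. The fixed seed, the strip placement, and the zero panel spacing are assumptions added for illustration, not part of the original answer.

```r
library(ggplot2)

set.seed(8)  # assumption: fixed seed so the example is reproducible
df <- data.frame(
  year = rep(c("2010", "2011"), each = 4),
  treatment = rep(c("Impact", "Control")),
  type = rep(c("Phylum1", "Phylum2"), each = 2),
  total = sample(1:100, 8))

# Facet by year; within each panel the two treatments are adjacent stacked bars
p_swapped <- ggplot(df, aes(x = treatment, y = total, fill = type)) +
  geom_bar(position = "stack", stat = "identity") +
  facet_wrap(~ year, strip.position = "bottom") +
  theme(strip.placement = "outside",
        panel.spacing = grid::unit(0, "lines"))  # assumption: no gap between panels

p_swapped
```

Moving the strips below the axis makes the year read like a second-level axis label, which is about as close to a dodged look as faceting gets.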
12

The closest you can get is by drawing a border around the dodged bars to highlight the stacked type values.

ggplot(df, aes(treatment, total, fill = year)) + 
geom_bar(stat="identity", position="dodge", color="black")

Matt Parker
Maiasaura
  • 1
    Hmm, the borders don't appear to line up with the data. For example, `set.seed(8)` before running the code and look at the values. – jslefche Oct 03 '12 at 21:04
  • 1
    If you really wanted to get fancy I bet you could use `geom_rect` to fill in some parts but then you're using ggplot to draw rather than plot. – Tyler Rinker Oct 03 '12 at 22:57
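
A minimal sketch of the `geom_rect` idea from the comment above, assuming you are willing to compute every rectangle yourself. The names `bar_width`, `rects`, and `x_centre` are hypothetical, and mapping year to alpha is just one way to tell the dodged bars apart.

```r
library(dplyr)
library(ggplot2)

set.seed(8)  # assumption: fixed seed so the example is reproducible
df <- data.frame(
  year = rep(c("2010", "2011"), each = 4),
  treatment = rep(c("Impact", "Control")),
  type = rep(c("Phylum1", "Phylum2"), each = 2),
  total = sample(1:100, 8))

bar_width <- 0.4  # hypothetical width of one dodged bar

rects <- df %>%
  mutate(treatment = factor(treatment), year = factor(year)) %>%
  arrange(treatment, year, type) %>%
  group_by(treatment, year) %>%
  # stack by hand: each type starts where the previous one ended
  mutate(ymax = cumsum(total), ymin = ymax - total) %>%
  ungroup() %>%
  mutate(
    # dodge by hand: shift left for the first year, right for the second
    x_centre = as.integer(treatment) + (as.integer(year) - 1.5) * bar_width,
    xmin = x_centre - bar_width / 2,
    xmax = x_centre + bar_width / 2)

p_rect <- ggplot(rects) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax,
                fill = type, alpha = year), colour = "black") +
  scale_x_continuous(breaks = seq_along(levels(rects$treatment)),
                     labels = levels(rects$treatment)) +
  scale_alpha_manual(values = c(1, 0.5)) +
  labs(x = "treatment", y = "total")

p_rect
```

As the comment says, this is drawing rather than plotting: the x axis is continuous and the treatment labels are put back by hand.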
12

You can use interaction(year, treatment) as the x-axis variable as an alternative to dodge.

library(dplyr)
library(ggplot2)
library(stringr)  # provides str_replace()

df=data.frame(
  year=rep(c("2010","2011"),each=4),
  treatment=rep(c("Impact","Control")),
  type=rep(c("Phylum1","Phylum2"),each=2),
  total=sample(1:100,8)) %>% 
  mutate(x_label = factor(str_replace(interaction(year, treatment), '\\.', ' / '),
                          ordered=TRUE))

ggplot(df, aes(x=x_label, y=total, fill=type)) +
  geom_bar(stat='identity') +
  labs(x='Year / Treatment')

Created on 2018-04-26 by the reprex package (v0.2.0).

andschar
Kent Johnson
  • 1
    It seems interaction was only used to create the labels? Then why not just `paste0(year, "/", treatment)`? – dracodoc Oct 10 '19 at 18:22
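
For comparison, the `paste0` variant from the comment might look like this. It is a sketch with an assumed fixed seed. Note that `paste0` returns a character vector, which ggplot2 orders alphabetically on the axis; wrap it in `factor()` with explicit levels if you need a different order.

```r
library(ggplot2)

set.seed(8)  # assumption: fixed seed so the example is reproducible
df <- data.frame(
  year = rep(c("2010", "2011"), each = 4),
  treatment = rep(c("Impact", "Control")),
  type = rep(c("Phylum1", "Phylum2"), each = 2),
  total = sample(1:100, 8))

# Build the combined label directly, without interaction() or str_replace()
df$x_label <- paste0(df$year, " / ", df$treatment)

p_paste <- ggplot(df, aes(x = x_label, y = total, fill = type)) +
  geom_bar(stat = "identity") +
  labs(x = "Year / Treatment")

p_paste
```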
9

You can play with some alpha:

library(dplyr)
library(ggplot2)

df %>% 
  group_by(year, treatment) %>% 
  mutate(cum_tot = cumsum(total)) %>% 
  ggplot(aes(treatment, cum_tot, fill =year)) + 
  geom_col(data = . %>% filter( type=="Phylum1"), position = position_dodge(width = 0.9), alpha = 1) +
  geom_col(data = . %>% filter( type=="Phylum2"), position = position_dodge(width = 0.9), alpha = 0.4) +
  geom_tile(aes(y=NA_integer_, alpha = factor(type))) + 
  scale_alpha_manual(values = c(1,0.4))

Now you can add some background fill, e.g. theme(panel.background = element_rect(fill = "yellow")), to mix the colors.

Finally, you have to fix the legend by hand, for example in Inkscape.

Roman
5

It can be done, but it's tricky/fiddly: you basically have to layer the bar chart.

Here is my code:

library(tidyverse)

df=data.frame(
  year=rep(c(2010,2011),each=4),
  treatment=rep(c("Impact","Control")),
  type=rep(c("Phylum1","Phylum2"),each=2),
  total=sample(1:100,8))

# separate the data by the variable we are dodging by, so
# we have two data frames: impact and control
impact <- df %>% filter(treatment == "Impact") %>% 
  mutate(pos = sum(total, na.rm = TRUE))

control <- df %>% filter(treatment == "Control") %>% 
  mutate(pos = sum(total, na.rm = TRUE))

# calculate the position for the annotation element
impact_an <- impact %>% group_by(year) %>% 
  summarise(
    pos = sum(total) + 12
    , treatment = first(treatment)
  )

control_an <- control %>% group_by(year) %>% 
  summarise(
    pos = sum(total) + 12
    , treatment = first(treatment)
  )

# define the width of the bars, we need this set so that
# we can use it to position the second layer geom_bar 
barwidth = 0.30

ggplot() +
  geom_bar(
    data = impact
    , aes(x = year, y = total, fill = type)
    , position = "stack"
    , stat = "identity"
    , width = barwidth
  ) + 
  annotate(
    "text"
    , x = impact_an$year
    ,y = impact_an$pos
    , angle = 90
    , label = impact_an$treatment
  ) +
  geom_bar(
    data = control
    # here we are offsetting the position of the second layer bar
    # by adding the barwidth plus 0.1 to push it to the right
    , aes(x = year + barwidth + 0.1, y = total, fill = type)
    , position = "stack"
    , stat = "identity"
    , width = barwidth
  ) +
  annotate(
    "text"
    , x = control_an$year + (barwidth * 1) + 0.1
    ,y = control_an$pos
    , angle = 90
    , label = control_an$treatment
  ) +
  scale_x_discrete(limits = c(2010, 2011))

This doesn't really scale well, but there are ways you could code it up to suit your situation. Credit where it's due: I originally learnt this method from the following post: https://community.rstudio.com/t/ggplot-position-dodge-with-position-stack/16425

Michael Gordon
  • I tried something similar, but instead of manually calculating the bar position, you can actually put every layer in a separate set of discrete x axis values, like 2010-a (for impact), 2010-b (for control), 2010-gap (as white space), then override the axis labels. This way you only need to manipulate your data a little bit then draw each layer on each own x values. – dracodoc Oct 10 '19 at 15:16
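
The idea in the comment above could be sketched as follows, assuming one discrete x slot per bar plus an empty "gap" slot between the years. The `slot` column, the slot names, and the label text are hypothetical, and this version uses a single layer rather than one layer per treatment.

```r
library(ggplot2)

set.seed(8)  # assumption: fixed seed so the example is reproducible
df <- data.frame(
  year = rep(c("2010", "2011"), each = 4),
  treatment = rep(c("Impact", "Control")),
  type = rep(c("Phylum1", "Phylum2"), each = 2),
  total = sample(1:100, 8))

# One x slot per year/treatment combination
df$slot <- paste(df$year, df$treatment, sep = "-")

# Slot order on the axis, with an empty slot acting as white space
slots <- c("2010-Control", "2010-Impact", "2010-gap",
           "2011-Control", "2011-Impact")

p_slots <- ggplot(df, aes(x = slot, y = total, fill = type)) +
  geom_col(width = 1) +  # width 1 makes the bars within a year touch
  scale_x_discrete(
    limits = slots,
    # override the axis labels; the gap slot gets an empty label
    labels = c("Control\n2010", "Impact\n2010", "",
               "Control\n2011", "Impact\n2011")) +
  labs(x = NULL)

p_slots
```

Because the stacking is left to `geom_col`, only the axis needs manual work, which tends to scale better than computing bar offsets by hand.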