60

Is there a way to set a constant width for geom_bar() when some data are missing, as in the time series example below? I've tried setting width in aes() with no luck. Compare the widths of the bars for May '11 and June '11 in the plot below the code example.

library(ggplot2)
colours <- c("#FF0000", "#33CC33", "#CCCCCC", "#FFA500", "#000000")
# Assign each of the 150 iris rows to one of ten months (Jan to Oct 2011)
iris$Month <- rep(seq(from = as.Date("2011-01-01"), to = as.Date("2011-10-01"), by = "month"), 15)
# Sum Sepal.Length per month and species (30 rows), then add a quota series
d <- aggregate(iris$Sepal.Length, by = list(iris$Month, iris$Species), sum)
d$quota <- seq(from = 2000, to = 60000, by = 2000)
colnames(d) <- c("Month", "Species", "Sepal.Width", "Quota")
d$Sepal.Width <- d$Sepal.Width * 1000
g1 <- ggplot(data = d, aes(x = Month, y = Quota, color = "Quota")) + geom_line(size = 1)
# Dropping rows 1 to 5 removes setosa for Jan to May, so those months have only two bars
g1 + geom_bar(data = d[c(-1:-5), ],
              aes(x = Month, y = Sepal.Width, width = 10, group = Species, fill = Species),
              stat = "identity", position = "dodge") +
  scale_fill_manual(values = colours)

plot

tcash21
  • 1
    There is a similar issue [here](https://github.com/hadley/ggplot2/issues/235); however, it deals only with `stats` that cannot handle the width parameter. `position='dodge'` seems to have the same failing. Someone with a bit more `ggplot` knowledge may want to weigh in, but this sounds like a potential bug. – Justin Jun 13 '12 at 18:14
  • I came across that issue as well. Good to know. For now, I'll use the workaround posted below by filling in values with NA. – tcash21 Jun 13 '12 at 20:03
  • In his reply to https://github.com/tidyverse/ggplot2/issues/1776, Hadley says: _That's how dodging works. You might want to try facetting instead._ BTW, this issue has already been addressed several times on SO, e.g. [here](http://stackoverflow.com/q/12806260/3817004) and [here](http://stackoverflow.com/q/15367762/3817004). – Uwe Jan 17 '17 at 09:24
  • 9
    Because Google tends to bring us here when we search for ``geom_bar +width +fixed``, I would like to point out this little-known trick: ``geom_bar(position = position_dodge(preserve = "single"))`` – PatrickT Oct 17 '17 at 18:34
  • 1
    There is a [new dodging algorithm](https://github.com/tidyverse/ggplot2/commit/6c91c1d3a835e952b0da97f9117fc760aa162819) in ggplot. The current release (2.2.1 Nov-2017) does not yet contain it. – jnas Nov 23 '17 at 08:05

3 Answers

46

Some new options for position_dodge(), and the new position_dodge2(), both introduced in ggplot2 3.0.0, can help.

You can use preserve = "single" in position_dodge() to base the widths off a single element, so the widths of all bars will be the same.

ggplot(data = d, aes(x = Month, y = Quota, color = "Quota")) + 
     geom_line(size = 1) + 
     geom_col(data = d[c(-1:-5),], aes(y = Sepal.Width, fill = Species), 
              position = position_dodge(preserve = "single") ) + 
     scale_fill_manual(values = colours)

Using position_dodge2() changes the way things are centered: each set of bars is centered at its x axis location. It has some padding built in, so use padding = 0 to remove it.

ggplot(data = d, aes(x = Month, y = Quota, color = "Quota")) + 
     geom_line(size = 1) + 
     geom_col(data = d[c(-1:-5),], aes(y = Sepal.Width, fill = Species), 
              position = position_dodge2(preserve = "single", padding = 0) ) + 
     scale_fill_manual(values = colours)
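
To check the effect of preserve without judging it by eye, the bar widths that ggplot2 computes can be read from the built layer data. This is only a minimal sketch: it assumes ggplot2 3.0.0 or later, uses a small toy data set (the same one as in another answer on this page) rather than the iris-based data, and bar_widths() is a hypothetical helper name, not part of ggplot2.

```r
library(ggplot2)

# Toy data: the combination a = "B", b = "a" is missing
dat <- data.frame(a = rep(LETTERS[1:3], 3),
                  b = rep(letters[1:3], each = 3),
                  v = 1:9)[-2, ]

# Hypothetical helper: widths of the bars drawn for a given dodge position
bar_widths <- function(position) {
  p <- ggplot(dat, aes(x = a, y = v, fill = b)) + geom_col(position = position)
  ld <- layer_data(p)          # computed layer data, including xmin and xmax
  round(ld$xmax - ld$xmin, 6)  # round to avoid floating-point noise
}

widths_default <- bar_widths(position_dodge())                     # preserve = "total"
widths_single  <- bar_widths(position_dodge(preserve = "single"))

unique(widths_default)  # more than one value: the bars at "B" are wider
unique(widths_single)   # one value: every bar has the same width
```

The same check should work for position_dodge2(preserve = "single", padding = 0) by passing it to the helper instead.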

aosmith
  • 2
    I fiddled with this option, but I couldn't get it to work with faceted plots---the bars just wouldn't line up. – mikeck Aug 29 '18 at 19:05
34

The easiest way is to supplement your data set so that every combination is present, even if it has NA as its value. Taking a simpler example (as yours has a lot of unneeded features):

dat <- data.frame(a=rep(LETTERS[1:3],3),
                  b=rep(letters[1:3],each=3),
                  v=1:9)[-2,]

ggplot(dat, aes(x=a, y=v, colour=b)) +
  geom_bar(aes(fill=b), stat="identity", position="dodge")

This shows the behavior you are trying to avoid: at x = "B" there is no bar for b = "a", so the two remaining bars are wider. Supplement dat with a data frame containing all the combinations of a and b:

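# unique() works for both factor and character columns; levels() returns NULL for
# character columns, which data.frame() creates by default since R 4.0.0
dat.all <- rbind(dat, cbind(expand.grid(a=unique(dat$a), b=unique(dat$b), stringsAsFactors=FALSE), v=NA))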

ggplot(dat.all, aes(x=a, y=v, colour=b)) +
  geom_bar(aes(fill=b), stat="identity", position="dodge")  

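The rbind() approach appends an NA row for every combination, including those that already have a value. If you would rather add only the missing combinations, a left join in base R is one option. This is a sketch under that assumption; combos and dat.full are illustrative names.

```r
dat <- data.frame(a = rep(LETTERS[1:3], 3),
                  b = rep(letters[1:3], each = 3),
                  v = 1:9)[-2, ]

# Every a/b combination that could appear in the plot
combos <- expand.grid(a = unique(dat$a), b = unique(dat$b),
                      stringsAsFactors = FALSE)

# Left join: combinations absent from dat get v = NA
dat.full <- merge(combos, dat, by = c("a", "b"), all.x = TRUE)

dat.full[is.na(dat.full$v), ]  # the single padded row: a = "B", b = "a"
```

dat.full can then be passed to the ggplot() call in place of dat.all.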

Brian Diggs
  • 4
    I get the same problem when using boxplot, but this approach by padding with NA does not fix my unequal width boxes problem. The NAs are just dropped. Padding with 0 appears to work, but then that makes for a mighty ugly plot that includes inappropriate data. Any suggestions? – Etienne Low-Décarie Nov 04 '13 at 22:09
  • @EtienneLow-Décarie Not offhand. Ask it as a new question (reference this one and show how it doesn't work for boxplots) and maybe someone else can help. – Brian Diggs Nov 04 '13 at 22:16
  • 1
    A note for future users: when applying this solution, be super careful about data types (factors and numerics), otherwise the solution may seem to be "broken" (see the upvoted comment by @EtienneLow-Décarie above). Check [this question](http://stackoverflow.com/questions/28890227/control-column-widths-in-a-ggplot2-graph-with-a-series-and-inconsistent-data) for details. – tonytonov Mar 06 '15 at 16:29
  • 27
    Honestly I don't think changing the data set for making a graph look nice is a good idea. ggplot should do better with missing observations. – user3507584 Jan 27 '16 at 18:53
  • 2
    I thought this solution was great until I wondered: what if GREEN is the NA value instead of RED? In that case, applying NA values just leaves an empty space between the columns, and my bars are not "stacked" anymore. Is there a solution for this? Thanks! – maycca Apr 29 '16 at 00:12
  • Keeping in mind the valid comments of @JustynaS. and @maycca, the only solution that doesn't mess with the dataset or require much additional work (though will still be quite an eyesore) is to use `facet_wrap` or `facet_grid` on the `month` variable -- it will create the three species such that all bar widths across the grid/facets will be equal – daRknight Nov 30 '16 at 22:07
21

I had the same problem but was looking for a solution that works with the pipe (%>%). Using tidyr::spread and tidyr::gather from the tidyverse does the trick. I use the same data as @Brian Diggs, but with uppercase variable names to not end up with double variable names when transforming to wide:

library(tidyverse)

dat <- data.frame(A = rep(LETTERS[1:3], 3),
                  B = rep(letters[1:3], each = 3),
                  V = 1:9)[-2, ]
dat %>% 
  spread(key = B, value = V, fill = NA) %>% # turn data to wide, using fill = NA to generate missing values
  gather(key = B, value = V, -A) %>% # go back to long, with the missings
  ggplot(aes(x = A, y = V, fill = B)) +
  geom_col(position = position_dodge())
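
In tidyr 1.0.0 and later, spread() and gather() still work but are marked as superseded in favour of pivot_wider() and pivot_longer(). A rough equivalent of the round trip above, shown here without the plotting step and with illustrative names wide and long, might look like this:

```r
library(tidyr)

dat <- data.frame(A = rep(LETTERS[1:3], 3),
                  B = rep(letters[1:3], each = 3),
                  V = 1:9)[-2, ]

# Wide: one column per level of B; the missing combination becomes NA
wide <- pivot_wider(dat, names_from = B, values_from = V)

# Back to long: the NA is kept, so every A/B combination now has a row
long <- pivot_longer(wide, cols = -A, names_to = "B", values_to = "V")
```

Both functions take the data as their first argument, so they can be chained with %>% and piped into ggplot() in the same way as the spread()/gather() version.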

Edit:

There is actually an even simpler solution to this problem that works with the pipe: tidyr::complete gives the same result in one line:

dat %>% 
  complete(A, B) %>% 
  ggplot(aes(x = A, y = V, fill = B)) +
  geom_col(position = position_dodge())
Tino