
I have a problem with ggplot's geom_bar.

I have three stacked bar charts, each drawn from a different variable and all using the same fourth variable, "Cohort", as the fill. [Image: three stacked bar charts using "Cohort" as fill; in one chart the bars for cohort 2 are misaligned.]

For some reason the columns for cohort 2 are misaligned in the third chart. All three charts use the same dataset and the same code.

library(tidyverse)
library(patchwork)
myColours <- c("#A71C49","#11897A","#DD4814", "#282A36")


DataSet <- structure(list(`Var1` = c(3, 2, 5, 3, 4, 1, 3, 1,
    5, 4, 5, 3, 5, 5, 5, 4, 4, 5, 4, 5, 5, 5, 5, 1, 5, 5, 4, 4, 3,
    5, 5, 4, 1, 3, 5, 2, 5, 5, 4, 4, 2, 5, 1, 5, 3, 5, 5, 5, 2, 5,
    3, 1, 5, 5, 5, 5, 4), `Var2` = c(3, 1, 4, 1, 2, 2,
    3, 1, 3, 3, 3, 3, 2, 5, 5, 1, 4, 4, 5, 5, 4, 5, 3, 2, 3, 5, 2,
    3, 3, 5, 5, 2, 1, 3, 4, 2, 4, 5, 3, 3, 5, 3, 1, 4, 3, 5, 3, 4,
    2, 4, 1, 4, 4, 5, 1, 3, 3), `Var3` = c(3, 2, 1,
    3, 1, 4, 3, 2, 4, 3, 3, 3, 5, 3, 3, 3, 3, 5, 5, 5, 3, 3, 3, 4,
    5, 2, 4, 4, 4, 5, 5, 1, 1, 3, 5, 2, 5, 5, 3, 4, 3, 1, 1, 4, 3,
    2, 5, 4, 2, 4, 4, 1, 4, 5, 1, 5, 2), Cohort = structure(c(1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L,
    4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("1", "2", "3", "4"
    ), class = "factor")), row.names = c(NA, -57L), class = c("tbl_df",
    "tbl", "data.frame"))


c5 <- ggplot(DataSet, aes(`Var1`, fill=Cohort)) +
  geom_bar() +
  theme(legend.position = "none")  +
  ylim(0,25) +
  scale_fill_manual(values=myColours)

c6 <- ggplot(DataSet, aes(`Var2`, fill=Cohort)) +
  geom_bar() +
  theme(legend.position = "none")  +
  ylim(0,25) +
  scale_fill_manual(values=myColours)

c7 <- ggplot(DataSet, aes(`Var3`, fill=Cohort)) +
  geom_bar() +
  ylim(0,25) +
  scale_fill_manual(values=myColours)

(c5 | c6 ) /
(c7 | guide_area())

I have the following error messages:

1: Removed 2 rows containing missing values (geom_bar). 
2: position_stack requires non-overlapping x intervals 

The missing values refer to the graph for Var1, the non-overlapping x intervals to the third graph. If I render just cohort 2, I also get these weirdly misaligned bars: [Image: the same misalignment as before, but only cohort 2 is visible.]

And cohort 3, Var3, to compare: [Image: a nicely aligned bar chart.]

I suspected the cause might be that cohort 2 contains only two distinct values, but the bar chart for Var1 above works fine. It is also not patchwork buggering it up, since the result is the same when I render only the Var3 bar chart. Nor is it the fact that the legend is rendered only for the Var3 graph.

Does anyone have an idea what the problem is or how I can force ggplot to align the bars correctly?

Thank you!

(R version 4.0.4 Patched (2021-02-17 r80030); tidyverse v.1.3.1; patchwork v.1.1.1)

Jan
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input that can be used to test and verify possible solutions. – MrFlick Jun 22 '21 at 07:18
  • Thanks, good point. I added data and code. It should be reproducible now. – Jan Jun 22 '21 at 07:34
  • The problem does not reproduce for me. If I run the code you've provided I only get the first error message, not the second. The plot looks like [this](https://i.ibb.co/p1vDtnD/cohort-plot.png). (Edit: perhaps you should update your libraries or R version?) – A. Stam Jun 22 '21 at 07:38
  • Not able to reproduce your issue using ggplot2 3.3.4. BTW: Your errors are just warnings. The first arises as you limit the y-scale so that some observations get dropped. – stefan Jun 22 '21 at 07:39
  • Thanks for the reproductions. Damn, it might actually be the versions then. I will try to update. Hope that gets past my IT. – Jan Jun 22 '21 at 07:48
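
To illustrate stefan's point about the first warning, here is a minimal sketch with made-up data (not the question's DataSet; the names toy, base, p_dropped and p_zoomed exist only for this example). ylim() sets scale limits, so bar segments that reach beyond them are removed before drawing, whereas coord_cartesian() only changes the visible window.

```r
library(ggplot2)

# Made-up data: 28 rows at a single x value, so the stacked bar is taller than 25
toy <- data.frame(
  x      = factor(rep(5, 28)),
  Cohort = factor(rep(c(3, 4), times = c(20, 8)))
)

base <- ggplot(toy, aes(x, fill = Cohort)) + geom_bar()

# Scale limits: segments reaching above 25 are dropped, with a warning when drawn
p_dropped <- base + ylim(0, 25)

# Coordinate limits: only the view changes, no data is removed
p_zoomed <- base + coord_cartesian(ylim = c(0, 30))
```

If the tallest stack should stay visible, raising the upper limit or switching to coord_cartesian() is probably the simpler option.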

1 Answer

The third graph is misaligned because geom_bar treats the x value as numeric (continuous), which makes it unsuitable for stacking (should x = 1 and x = 1.00001 be stacked on top of each other?). Converting it to an ordered factor helps ggplot understand that the values are discrete categories.
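
As a rough sketch of where the overlap could come from: with a continuous x, the default bar width is derived from the smallest gap between the x values. The base-R snippet below mimics that calculation for cohort 2 only. It assumes the width is worked out per fill group and equals 90% of the smallest gap, which matches the symptoms here but has not been checked against the exact ggplot2 version in the question. min_gap is a helper made up for this illustration, and the values are copied by hand from rows 10 to 15 of the question's data.

```r
# Hypothetical helper: smallest gap between the distinct x values
min_gap <- function(x) min(diff(sort(unique(x))))

# Cohort 2 values copied from the question's DataSet (rows 10 to 15)
var1_cohort2 <- c(4, 5, 3, 5, 5, 5)  # distinct values 3, 4, 5
var3_cohort2 <- c(3, 3, 3, 5, 3, 3)  # distinct values 3, 5

gap_var1 <- min_gap(var1_cohort2)
gap_var3 <- min_gap(var3_cohort2)

# Assumed default bar width: 90% of the smallest gap
width_var1 <- 0.9 * gap_var1
width_var3 <- 0.9 * gap_var3
```

Under that assumption the cohort 2 bars for Var3 would be twice as wide as everyone else's, so their x intervals overlap the neighbouring bars. A discrete x axis avoids the per-group calculation altogether.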

Consider this example only using the tidyverse:

library(tidyverse)

myColours <- c("#A71C49", "#11897A", "#DD4814", "#282A36")

# Lengthen the dataset: one row per (Cohort, Variable, value)
# (DataSet is the tibble defined in the question)
DataSet2 <- DataSet %>% pivot_longer(cols = -Cohort, names_to = "Variable")

# This helps against the non-overlapping x intervals issue:
# an ordered factor gives a discrete x axis
DataSet2$value <- as.ordered(DataSet2$value)

ggplot(DataSet2, aes(x = value, fill = Cohort)) +
  geom_bar(position = position_stack()) +
  ylim(0, 25) +
  facet_wrap(vars(Variable)) +  # make multiple graphs split by the column "Variable"
  scale_fill_manual(values = myColours)

Result: [image of the resulting faceted bar chart]
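
If the separate plots from the question are preferred over facets, the same conversion can be done per plot. The following is a minimal sketch with made-up stand-in data (toy, Var3_f and p are names invented for this example); it uses factor() with explicit levels rather than as.ordered(), on the assumption that all five answer values should appear on the axis even when some are absent.

```r
library(ggplot2)

# Small stand-in data, made up for illustration (not the question's DataSet)
toy <- data.frame(
  Var3   = c(3, 3, 5, 1, 2, 3, 4, 5, 5),
  Cohort = factor(c(2, 2, 2, 3, 3, 3, 3, 3, 3))
)

# Convert x to a factor with explicit levels so the axis is discrete
toy$Var3_f <- factor(toy$Var3, levels = 1:5)

p <- ggplot(toy, aes(Var3_f, fill = Cohort)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE) +  # keep unused levels on the axis
  labs(x = "Var3")
```

With a discrete axis every bar gets the same width, whichever values a given cohort happens to contain.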

Sandwichnick
  • Hi, thank you, this works! The "as.ordered" command is really handy and also works with my graphs (which I prefer due to other things going on). Thank you very much. Marked as solution. – Jan Jun 22 '21 at 08:39