
I want to plot all four variables in RStudio. I appear to have two groups across three variables, plus a Count, and I have no idea how to do this with ggplot2. The x-axis should show age_band and sex, and the y-axis the Count of those admitted and not admitted. I want the legend below the overlaid bar plot. Because the analysis and data are confidential, I have added a hand-drawn picture of the desired plot below. Can someone help? I've searched Stack Overflow and could not find a good reproducible example.

Here are the two versions of the data I have after data manipulation.

First type of data:

    my_data1 <- structure(list(
      age_band = c("0 yrs", "0 yrs", "0 yrs", "0 yrs",
                   "1-4 yrs", "1-4 yrs", "1-4 yrs", "1-4 yrs",
                   "10-14 yrs", "10-14 yrs", "10-14 yrs", "10-14 yrs",
                   "15-19 yrs", "15-19 yrs", "15-19 yrs", "15-19 yrs"),
      sex = c("Female", "Female", "Male", "Male", "Female",
              "Female", "Male", "Male", "Female", "Female",
              "Male", "Male", "Female", "Female", "Male", "Male"),
      patient.class = c("Not Admitted", "ORDINARY ADMISSION",
                        "Not Admitted", "ORDINARY ADMISSION",
                        "Not Admitted", "ORDINARY ADMISSION",
                        "Not Admitted", "ORDINARY ADMISSION",
                        "Not Admitted", "ORDINARY ADMISSION",
                        "Not Admitted", "ORDINARY ADMISSION",
                        "Not Admitted", "ORDINARY ADMISSION",
                        "Not Admitted", "ORDINARY ADMISSION"),
      Count = c(5681L, 1458L, 7667L, 2154L, 8040L, 2481L, 11737L,
                3601L, 2904L, 938L, 3883L, 1233L, 3251L, 1266L,
                2465L, 1031L)),
      row.names = c(NA, -16L),
      class = c("tbl_df", "tbl", "data.frame"))

Second type of data:

    my_data2 <- structure(list(
      age_band = c("0 yrs", "0 yrs", "0 yrs", "0 yrs",
                   "1-4 yrs", "1-4 yrs", "1-4 yrs", "1-4 yrs",
                   "10-14 yrs", "10-14 yrs", "10-14 yrs", "10-14 yrs",
                   "15-19 yrs", "15-19 yrs", "15-19 yrs", "15-19 yrs"),
      sex_patient_class = c("female_admitted", "female_not_admitted",
                            "male_admitted", "male_not_admitted",
                            "female_admitted", "female_not_admitted",
                            "male_admitted", "male_not_admitted",
                            "female_admitted", "female_not_admitted",
                            "male_admitted", "male_not_admitted",
                            "female_admitted", "female_not_admitted",
                            "male_admitted", "male_not_admitted"),
      Count = c(1458L, 5681L, 2154L, 7667L, 2481L, 8040L, 3601L, 11737L,
                938L, 2904L, 1233L, 3883L, 1266L, 3251L, 1031L, 2465L)),
      row.names = c(NA, -16L),
      class = c("grouped_df", "tbl_df", "tbl", "data.frame"),
      vars = "age_band", drop = TRUE,
      indices = list(0:3, 4:7, 8:11, 12:15),
      group_sizes = c(4L, 4L, 4L, 4L), biggest_group_size = 4L,
      labels = structure(list(age_band = c("0 yrs", "1-4 yrs", "10-14 yrs", "15-19 yrs")),
                         row.names = c(NA, -4L), class = "data.frame",
                         vars = "age_band", drop = TRUE))
GaB
  • Related: [overlay/superimpose grouped bar plots in ggplot2](https://stackoverflow.com/questions/50554501/overlay-superimpose-grouped-bar-plots-in-ggplot2) – Henrik Aug 16 '18 at 10:59
  • Henrik - that seems right. Thank you a lot! – GaB Aug 16 '18 at 11:01

1 Answer


To superimpose the columns for admitted patients on those for non-admitted patients, filter the data twice, once for each layer. I specify the aesthetics in the initial ggplot() call so that both layers share a common fill legend.

library(tidyverse)

# Aesthetics are set once here so that both layers share a single fill legend
ggplot(my_data2, aes(age_band, Count, fill = sex_patient_class)) +
  # Back layer: not-admitted counts at the default bar width
  geom_col(data = filter(my_data2, sex_patient_class %in% c("male_not_admitted", "female_not_admitted")), 
           position = position_dodge()) +
  # Front layer: admitted counts as narrower bars, centered with a dodge width of 0.9
  geom_col(data = filter(my_data2, sex_patient_class %in% c("male_admitted", "female_admitted")), 
           position = position_dodge(0.9), width = 0.5) +
  scale_fill_manual(name = "", 
                    breaks = c("male_admitted", "male_not_admitted", 
                               "female_admitted", "female_not_admitted"),
                    labels = c("Male Admitted", "Male Not admitted", 
                               "Female Admitted", "Female Not admitted"), 
                    values = c("grey80", "black", "red", "orange"))
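
The code above works on the second data layout, where sex and admission status are already combined into `sex_patient_class`. If only the first layout is available, the combined key can be derived from the `sex` and `patient.class` columns. The following is a minimal base R sketch under that assumption; the name `first_layout` is hypothetical, and only the first two age bands from the question are included to keep it short.

```r
# First data layout from the question (first two age bands only).
# `first_layout` is an illustrative name, not one used in the question.
first_layout <- data.frame(
  age_band = rep(c("0 yrs", "1-4 yrs"), each = 4),
  sex = rep(c("Female", "Female", "Male", "Male"), times = 2),
  patient.class = rep(c("Not Admitted", "ORDINARY ADMISSION"), times = 4),
  Count = c(5681L, 1458L, 7667L, 2154L, 8040L, 2481L, 11737L, 3601L),
  stringsAsFactors = FALSE
)

# Map the two patient classes onto the short labels used in the second layout
status <- ifelse(first_layout$patient.class == "Not Admitted",
                 "not_admitted", "admitted")

# Combine sex and status into one key such as "female_not_admitted"
first_layout$sex_patient_class <- paste(tolower(first_layout$sex), status, sep = "_")

unique(first_layout$sex_patient_class)
```

The resulting data frame has the same `age_band`, `sex_patient_class` and `Count` columns as the second layout, so it should be usable in place of `my_data2` in the plotting code.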


Detailed explanation

The actual superimposing takes place in the two geom_col calls. The order of the calls is important, as the second layer is drawn on top of the first. Therefore we start with the "back" columns:

With filter we select only the not_admitted patients and use this as the data for geom_col. We don't need to repeat the aesthetics from the initial ggplot call, as they are inherited unless specified otherwise. position_dodge() places the columns next to each other within each age group.

p <- ggplot(my_data2, aes(age_band, Count, fill = sex_patient_class)) +
  geom_col(data = filter(my_data2, sex_patient_class %in% c("male_not_admitted", "female_not_admitted")), 
           position = position_dodge()) 
p


Now, to add the other columns on top, we change the filter statement to select the admitted patients. As we want the "front" columns to be narrower than the "back" columns, we specify width = 0.5.

p + geom_col(data = filter(my_data2, sex_patient_class %in% c("male_admitted", "female_admitted")), 
             position = position_dodge(), width = 0.5)


Now we're almost done. To move the "front" columns to the center of the "back" columns, we need to specify the width of position_dodge(). In this case, the value that centers them is 0.9. To be on the safe side (i.e. to make sure they are really centered in front of the back columns), specify the same dodge width for both geom_col calls. We then change the not-so-pretty colors (here with the brewer palette "Paired") and the legend information, and we are done:

p + geom_col(data = filter(my_data2, sex_patient_class %in% c("male_admitted", "female_admitted")), 
             position = position_dodge(0.9), width = 0.5) +
  scale_fill_brewer(name = "", 
                    breaks = c("male_admitted", "male_not_admitted", 
                               "female_admitted", "female_not_admitted"),
                    labels = c("Male Admitted", "Male Not admitted", 
                               "Female Admitted", "Female Not admitted"), 
                    palette = "Paired")
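
The question also asks for the legend to sit below the plot, which the code above does not set. A minimal self-contained sketch of the complete plot with that addition might look as follows. It assumes ggplot2 is installed; the names `demo`, `not_admitted`, `admitted`, `dodge` and `p_final` are illustrative, and only the first two age bands from the question's data are used.

```r
library(ggplot2)

# First two age bands of the second data layout from the question
demo <- data.frame(
  age_band = rep(c("0 yrs", "1-4 yrs"), each = 4),
  sex_patient_class = rep(c("female_admitted", "female_not_admitted",
                            "male_admitted", "male_not_admitted"), times = 2),
  Count = c(1458L, 5681L, 2154L, 7667L, 2481L, 8040L, 3601L, 11737L),
  stringsAsFactors = FALSE
)

# Split the rows into the "back" and "front" layers
not_admitted <- subset(demo, grepl("_not_admitted$", sex_patient_class))
admitted     <- subset(demo, !grepl("_not_admitted$", sex_patient_class))

# One dodge specification shared by both layers so the bars stay aligned
dodge <- position_dodge(width = 0.9)

p_final <- ggplot(demo, aes(age_band, Count, fill = sex_patient_class)) +
  geom_col(data = not_admitted, position = dodge) +           # wide "back" bars
  geom_col(data = admitted, position = dodge, width = 0.5) +  # narrow "front" bars
  scale_fill_brewer(name = "", palette = "Paired") +
  theme(legend.position = "bottom")                           # legend below the plot

p_final
```

Reusing a single `dodge` object for both layers is one way to follow the advice about keeping the dodge widths identical, since the value only has to be changed in one place.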


kath
  • hey kath, thank you. I actually want the female admitted to be in front of the female not admitted, and male admitted in front of male not admitted. Hopefully my picture was clear enough. And came up with a second type of data to make it easier on people. – GaB Aug 16 '18 at 10:57
  • Hello Kath - Henrik helped, and it seems the language I use has to be more relevant. Thank you Kath. See: overlay/superimpose grouped bar plots in ggplot2 – GaB Aug 16 '18 at 11:03
  • Oh now I get it..... I'm sorry I didn't understand fully what you wanted... – kath Aug 16 '18 at 11:05
  • well, if you want to get points you can reproduce it and will tick it as the right one :) – GaB Aug 16 '18 at 11:06
  • Kath. Thank you a lot! It works amazing! And your code is soo elegant. Found new ways of ggploting – GaB Aug 16 '18 at 11:18
  • @kath elegant solution. visiting this website each day will "keep the doctor away". From a learning perspective, is it possible for you to add some more text to your answer that explains in detail, how the superimposing is taking place. And yes, I upvoted the answer. – mnm Aug 16 '18 at 12:44
  • @nilāmbara I added a longer explanation. Thanks for the comment, and by revisiting this question I also found I had some unnecessary pieces of code in my answer. So we both got something out of this ;) – kath Aug 16 '18 at 13:11
  • thank you both for everything. Apparently, whilst polishing I wanted to comeback and ask about centering the bars. Thank you both nilāmbara and kath. You are starts ! :) – GaB Aug 16 '18 at 13:37
  • stars not starts :)) – GaB Aug 16 '18 at 13:51
  • @kath thanks a lot for the detailed explanation. Now, I think the answer is complete in reference to an `enriched learning experience`. @gabriel-burcea thank you for asking a `beautiful` question and providing a `minimum reprex`. – mnm Aug 16 '18 at 14:11