
[Image: screenshot of the data]

    City        2018-2019   2019-2020   2020-Present
1   Amritsar    0.0365      0.0205      0.0284
2   Jalandhar   0.0034      0.0031      0.0020
3   Ludhiana    0.0238      0.0235      0.0151
4   Moga        0.0105      0.0038      0.0202
5   Pathankot   0.0157      0.0013      0.0070
6   Phagwara    0.0100      0.0100      0.0114

I need to code a grouped bar graph such that the City names are on the horizontal axis and for each city, I can see three bars corresponding to the rejection rates for each of the three years.

It would be super helpful for me to be walked through how to code this in R.

r2evans
  • Welcome to the site! This looks like a duplicate question, so take a look at [this](https://stackoverflow.com/questions/48591363/two-bars-next-to-each-other-using-geom-bar/48591449#48591449) and see if that works for ya – Punintended Jun 11 '20 at 18:36
  • It is almost definitely a duplicate, searching SO for [`[r] grouped bar plot`](https://stackoverflow.com/search?q=%5Br%5D+grouped+bar+plot) has many examples. Ananya, it can be difficult to search for relevant questions, but it can save a lot of time to do that before asking a question. Adding the `[r]` literal (brackets around `r`) helps on SO because it forces the R programming language tag, which narrows down the search significantly. – r2evans Jun 11 '20 at 18:38
  • I think the hardest time you're going to have is that column names shouldn't start with a number in R, so you're going to have to fix the automatic changes that occur when you try to import the data. – Ian Campbell Jun 11 '20 at 18:41
  • Does this answer your question? [stacked bars within grouped bar chart](https://stackoverflow.com/questions/13486501/stacked-bars-within-grouped-bar-chart) – GordonShumway Jun 11 '20 at 20:04

1 Answer

This turned out to be pretty easy if you're importing from Excel (which is what that screenshot looks like?). Just use the readxl package: it keeps the column names as they appear in the sheet, even though they start with a digit, so nothing gets renamed on import.

library(readxl)
data <- readxl::read_xlsx("data.xlsx")
data
## A tibble: 6 x 4
#  City      `2018-2019` `2019-2020` `2020-Present`
#  <chr>           <dbl>       <dbl>          <dbl>
#1 Amritsar       0.0365      0.0205         0.0284
#2 Jalandhar      0.0034      0.0031         0.002 
#3 Ludhiana       0.0238      0.0235         0.0151
#4 Moga           0.0105      0.0038         0.0202
#5 Pathankot      0.0157      0.0013         0.007 
#6 Phagwara       0.01        0.01           0.0114
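
If you don't have the Excel file at hand, the same table can be typed in directly. This is only a sketch for following along: the values are copied from the question, and building the data by hand is my assumption, not part of the original import step. The backticks are needed because the column names start with a digit.

```r
library(tibble)

# Hand-typed copy of the table from the question (stand-in for the Excel import).
# Backticks allow column names that are not syntactic R names.
data <- tibble(
  City           = c("Amritsar", "Jalandhar", "Ludhiana",
                     "Moga", "Pathankot", "Phagwara"),
  `2018-2019`    = c(0.0365, 0.0034, 0.0238, 0.0105, 0.0157, 0.0100),
  `2019-2020`    = c(0.0205, 0.0031, 0.0235, 0.0038, 0.0013, 0.0100),
  `2020-Present` = c(0.0284, 0.0020, 0.0151, 0.0202, 0.0070, 0.0114)
)

data
```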

Now we need to make the data longer so it can be plotted by ggplot:

library(dplyr)
library(tidyr)
data %>%
  pivot_longer(-City, names_to = "Period", values_to = "Value")
## A tibble: 18 x 3
#   City      Period        Value
#   <chr>     <chr>         <dbl>
# 1 Amritsar  2018-2019    0.0365
# 2 Amritsar  2019-2020    0.0205
# 3 Amritsar  2020-Present 0.0284
#...
#16 Phagwara  2018-2019    0.01  
#17 Phagwara  2019-2020    0.01  
#18 Phagwara  2020-Present 0.0114

Then we can use ggplot. The aes function maps columns of the data to properties of the graph: here City goes on the x axis, Value on the y axis, and the bar fill changes with Period. geom_bar tells ggplot we want a bar plot. position = "dodge" puts the bars for each city side by side (grouped) instead of stacking them. stat = "identity" uses the numbers in Value as the bar heights; by default geom_bar would count rows instead. These arguments are things you'll just have to look up until you're more experienced.

library(ggplot2)
data %>%
  pivot_longer(-City, names_to = "Period", values_to = "Value") %>%
ggplot(aes(x = City, y = Value, fill = Period)) +
  geom_bar(stat = "identity", position = "dodge") + 
  labs(y = "Rejection Rate")

[Image: the resulting grouped bar chart, with three bars per city, one for each period]
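
For reference, here is the whole thing as one self-contained sketch. It swaps in geom_col(), which is ggplot2's shorthand for geom_bar(stat = "identity"), and it types the data in by hand instead of reading the Excel file; the object names `long` and `p` are just ones I picked for illustration.

```r
library(tibble)
library(tidyr)
library(ggplot2)

# Hand-typed copy of the question's table (assumption: no Excel file available)
data <- tibble(
  City           = c("Amritsar", "Jalandhar", "Ludhiana",
                     "Moga", "Pathankot", "Phagwara"),
  `2018-2019`    = c(0.0365, 0.0034, 0.0238, 0.0105, 0.0157, 0.0100),
  `2019-2020`    = c(0.0205, 0.0031, 0.0235, 0.0038, 0.0013, 0.0100),
  `2020-Present` = c(0.0284, 0.0020, 0.0151, 0.0202, 0.0070, 0.0114)
)

# One row per City/Period combination
long <- pivot_longer(data, -City, names_to = "Period", values_to = "Value")

# geom_col() takes bar heights from y directly, so no stat argument is needed
p <- ggplot(long, aes(x = City, y = Value, fill = Period)) +
  geom_col(position = "dodge") +
  labs(y = "Rejection Rate")

p
```

Storing the plot in a variable also makes it easy to save later, e.g. with ggsave().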

Ian Campbell