I have a dataset like this:

year      city       type   sex  number
2008      London      A      F    100
2008      London      B      F    110
2008      London      A      M    101
2008      London      B      M    111
2009      London      A      F    200
2009      London      B      F    210
2009      London      A      M    201
2009      London      B      M    211
2008      NY          A      F    100
2008      NY          B      F    110
2008      NY          A      M    101
2008      NY          B      M    111
2009      NY          A      F    200
2009      NY          B      F    210
2009      NY          A      M    201
2009      NY          B      M    211

I want to plot this so that, for each year, the sums for F and M form the two parts of a stacked bar, with the percentage of each part shown.

How can I do this in R?

mans

1 Answer

We can do this with a tidyverse approach:

  1. Group by the 'year' and 'sex' columns
  2. Get the sum of 'number' in summarise
  3. Create a column 'perc' by dividing each summarised 'number' by the sum of that column
  4. Specify x as 'year', y as the summed 'number', fill as 'sex', and 'perc' as the label in the aes of ggplot
  5. Use geom_col to draw the bar plot
  6. Add the percentage labels with geom_text
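library(dplyr)
library(ggplot2)

df1 %>%
    # total 'number' for each year/sex combination
    group_by(year, sex) %>%
    summarise(number = sum(number), .groups = 'drop') %>%
    # the data is ungrouped here, so 'perc' is each row's share of the grand total
    mutate(perc = number / sum(number), year = factor(year)) %>%
    ggplot(aes(x = year, y = number, fill = sex,
               label = scales::percent(perc))) +
    # side-by-side bars for F and M within each year
    geom_col(position = 'dodge') +
    # percentage labels just above each bar
    geom_text(position = position_dodge(width = .9),
              vjust = -0.5,
              size = 3) +
    theme_bw()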

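The question asks for a stacked bar per year, while the code above draws the F and M bars side by side and labels each with its share of the grand total. The following is a minimal sketch of a stacked variant, assuming the percentages are meant to be shares within each year. The names `stacked` and `p_stacked` are made up for illustration, and `df1` is rebuilt compactly here only so the block runs on its own.

```r
library(dplyr)
library(ggplot2)

# Same values as `df1` in the answer, rebuilt compactly (illustrative only)
df1 <- data.frame(
  year   = rep(rep(c(2008L, 2009L), each = 4), times = 2),
  city   = rep(c("London", "NY"), each = 8),
  type   = rep(c("A", "B"), times = 8),
  sex    = rep(rep(c("F", "M"), each = 2), times = 4),
  number = rep(c(100L, 110L, 101L, 111L, 200L, 210L, 201L, 211L), times = 2)
)

stacked <- df1 %>%
  group_by(year, sex) %>%
  summarise(number = sum(number), .groups = "drop") %>%
  # assumption: percentages are computed within each year, so each bar sums to 100%
  group_by(year) %>%
  mutate(perc = number / sum(number)) %>%
  ungroup() %>%
  mutate(year = factor(year))

p_stacked <- ggplot(stacked, aes(x = year, y = number, fill = sex,
                                 label = scales::percent(perc, accuracy = 0.1))) +
  geom_col() +  # geom_col stacks by default
  # vjust = 0.5 places each label in the middle of its stacked segment
  geom_text(position = position_stack(vjust = 0.5), size = 3) +
  theme_bw()

p_stacked
```

Dropping the `group_by(year)` step would give shares of the overall total instead, as in the original code.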

data

df1 <- structure(list(year = c(2008L, 2008L, 2008L, 2008L, 2009L, 2009L, 
2009L, 2009L, 2008L, 2008L, 2008L, 2008L, 2009L, 2009L, 2009L, 
2009L), city = c("London", "London", "London", "London", "London", 
"London", "London", "London", "NY", "NY", "NY", "NY", "NY", "NY", 
"NY", "NY"), type = c("A", "B", "A", "B", "A", "B", "A", "B", 
"A", "B", "A", "B", "A", "B", "A", "B"), sex = c("F", "F", "M", 
"M", "F", "F", "M", "M", "F", "F", "M", "M", "F", "F", "M", "M"
), number = c(100L, 110L, 101L, 111L, 200L, 210L, 201L, 211L, 
100L, 110L, 101L, 111L, 200L, 210L, 201L, 211L)), 
class = "data.frame", row.names = c(NA, 
-16L))
akrun
  • This works. How can I move the percent labels into the bars and rotate them 90 degrees? – mans Jun 19 '21 at 18:27
  • @mans In the `geom_text`, I specified `position_dodge` and `vjust`. I think you can change those values – akrun Jun 19 '21 at 18:28
  • I played with them but have had no success so far. I'm doing more reading and trial and error to see how I can do this. – mans Jun 19 '21 at 18:30
  • @mans When you say you want to move them into the bar, where exactly do you want them: the middle, top, or bottom? – akrun Jun 19 '21 at 18:31
  • It doesn't really matter, but maybe the middle – mans Jun 19 '21 at 18:33
  • @mans Just add `aes(y = 10)` in the `geom_text` to bring all of them to the bottom, or change the value accordingly for different placements – akrun Jun 19 '21 at 18:37
  • @mans Also, there are many options mentioned [here](https://stackoverflow.com/questions/47916307/specify-position-of-geom-text-by-keywords-like-top-bottom-left-right) – akrun Jun 19 '21 at 18:39
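
As a follow-up to the comment thread, here is a minimal sketch of one way to put the labels in the middle of the dodged bars and rotate them 90 degrees. It overrides `y` in the text layer, as suggested in the comments, using half the bar height rather than a fixed value, and uses the `angle` argument of `geom_text`. The names `labelled` and `p_inside` are hypothetical, and `df1` is rebuilt compactly only so the block runs on its own.

```r
library(dplyr)
library(ggplot2)

# Same values as `df1` in the answer, rebuilt compactly (illustrative only)
df1 <- data.frame(
  year   = rep(rep(c(2008L, 2009L), each = 4), times = 2),
  city   = rep(c("London", "NY"), each = 8),
  type   = rep(c("A", "B"), times = 8),
  sex    = rep(rep(c("F", "M"), each = 2), times = 4),
  number = rep(c(100L, 110L, 101L, 111L, 200L, 210L, 201L, 211L), times = 2)
)

labelled <- df1 %>%
  group_by(year, sex) %>%
  summarise(number = sum(number), .groups = "drop") %>%
  mutate(perc = number / sum(number), year = factor(year))

p_inside <- ggplot(labelled, aes(x = year, y = number, fill = sex,
                                 label = scales::percent(perc, accuracy = 0.1))) +
  geom_col(position = "dodge") +
  geom_text(aes(y = number / 2),                      # halfway up each bar
            position = position_dodge(width = 0.9),   # line labels up with the dodged bars
            angle = 90,                               # rotate the label text
            size = 3) +
  theme_bw()

p_inside
```

Replacing `number / 2` with a small constant such as `10` moves every label to the bottom of its bar instead.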