I would like to add percentage labels to a percentage barplot

I found solutions that use position="fill" (Add percentage labels to a stacked barplot, and How to draw stacked bars in ggplot2 that show percentages based on group?). However, I would like to keep the relative frequencies for every group.

Here is an example plot:

# library
library(ggplot2)

# data  
df <- data.frame(group=c("A","A","A","A","B","B","B","C","C"),
                   anon=c("yes","no","no","no","yes","yes","no","no","no"))

# percentage barplot
ggplot(df, aes(x = group, fill = anon)) +
  geom_bar(aes(y = (..count..)/sum(..count..))) +
  scale_y_continuous(labels = scales::percent) +
  ylab("relative frequencies")

Created on 2020-04-19 by the reprex package (v0.3.0)

Now I would like to add percentage labels to each red and green portion of each bar, so that I get "relative-relative" values (e.g. 25% for "yes" in group A). How can this be done? Do I have to change my df for this, or is this somehow possible within the ggplot function?

ava

1 Answer

A possible solution is to calculate the proportions outside of ggplot2. Here I use dplyr to calculate both sets of proportions:

library(dplyr)

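df_calculated <- df %>%
  count(group, anon) %>%                 # n = rows per group/anon combination
  mutate(Percent_col = n / sum(n)) %>%   # share of all observations (bar height)
  group_by(group) %>%
  mutate(Percent = n / sum(n))           # share within each group (label text)

df_calculated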

# A tibble: 5 x 5
# Groups:   group [3]
  group anon      n Percent_col Percent
  <fct> <fct> <int>       <dbl>   <dbl>
1 A     no        3       0.333   0.75 
2 A     yes       1       0.111   0.25 
3 B     no        1       0.111   0.333
4 B     yes       2       0.222   0.667
5 C     no        2       0.222   1    
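
As a quick cross-check of the two proportion columns, the same numbers can be reproduced in base R with table() and prop.table(). This is only a sketch; the names counts, overall and within are made up for illustration and are not part of the answer above.

```r
# Same example data as in the question
df <- data.frame(group = c("A","A","A","A","B","B","B","C","C"),
                 anon  = c("yes","no","no","no","yes","yes","no","no","no"))

# Contingency table: rows are groups, columns are "no" / "yes"
counts <- table(df$group, df$anon)

# Each cell divided by the grand total (should match Percent_col)
overall <- prop.table(counts)

# Each cell divided by its row total (should match Percent)
within <- prop.table(counts, margin = 1)

round(overall, 3)
round(within, 3)
```

Unlike the tibble, the table also contains a cell for group C / "yes", which is simply 0 because that combination never occurs in the data.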

Then use geom_col instead of geom_bar to draw the bar graph, and geom_text to add a text label for each proportion:

library(dplyr)
library(ggplot2)

ggplot(df_calculated, aes(x = group, y = Percent_col, fill = anon))+
  geom_col()+
  scale_y_continuous(labels=scales::percent) +
  ylab("relative frequencies")+
  geom_text(aes(label = scales::percent(Percent)), position = position_stack(0.5))+
  geom_text(inherit.aes = FALSE, 
            data = df_calculated %>% 
              group_by(group) %>% 
              summarise(Sum = sum(Percent_col)),
            aes(label = scales::percent(Sum), 
                y = Sum, x = group), vjust = -0.5)

Does it answer your question?

dc37
  • This solves my question. Since I am not very familiar with calculating these proportions outside of ggplot: is it possible to additionally add the % of each col on top of each of the cols? – ava Apr 19 '20 at 19:17
  • Do you want the percentage of each group (like 44.44 for A, 33.33 for B, 22.22 for C)? Or do you want the percentages currently displayed in the bars to be on top of each yes / no col? – dc37 Apr 19 '20 at 19:19
  • The percentage of each group (44.44 for A, 33.33 for B and 22.22 for C) – ava Apr 19 '20 at 19:20
  • Ok, please see my updated answer. Let me know if it is what you are looking for. – dc37 Apr 19 '20 at 19:36
  • Yes, it is great, thanks! Just for my understanding, is it usually better to calculate such measurements outside of `ggplot`? – ava Apr 20 '20 at 13:09
  • You're welcome ;) In my opinion, it is easier to perform the calculation outside because it allows you to verify that the calculation is correct. Here especially, as you have multiple groups to calculate, I would say it is even easier outside of `ggplot2`. That said, I think some people prefer to use `..count..`, but I'm not very good at it. – dc37 Apr 20 '20 at 17:27