0

I am trying to create a bar chart in R from a data frame. The y-axis should show counts, but each bar should be labelled with a concatenation of its percentage and its count.

My data frame looks like this:

ID    Response
1    No
2    Yes
3    No
..    ..

I would like the end result to look like the chart below.

[Image: example of the desired bar chart]

ALEX.VAMVAS
  • What have you tried so far? Where are you getting stuck? – OTStats Feb 09 '19 at 23:13
  • I have tried creating a cross tab with the counts and the frequencies, as below: U1 <- train %>% group_by(Survived) %>% summarise(count = n()) %>% mutate(perc = count/sum(count)) and then tried to plot it with ggplot2 – ALEX.VAMVAS Feb 09 '19 at 23:18
  • Add your code attempt to your question. – OTStats Feb 09 '19 at 23:19
  • Sorry OTStats - I edited my previous answer – ALEX.VAMVAS Feb 09 '19 at 23:20
  • I have also seen that answer https://stackoverflow.com/questions/24776200/ggplot-replace-count-with-percentage-in-geom-bar/24777521 which gives a close solution but I can't seem to configure the ggplot to achieve my purpose – ALEX.VAMVAS Feb 09 '19 at 23:28
  • The example that you have provided doesn't make any sense. How can the counts for both Yes and No be 320 but there are different percentages? – OTStats Feb 09 '19 at 23:37

3 Answers

1

I'd try something like the below. It's awesome that you're using summarize and mutate; I guess by habit I sometimes use base functions like table.

library(tidyverse)

# Simulated responses
resps <- sample(c("yes", "no"), 850, replace = TRUE)

# Percentage and count for each response level
percents <- round(100 * table(resps) / length(resps), 2)
counts <- as.numeric(table(resps))

plotdat <- data.frame(
  response = names(percents),
  percents = as.numeric(percents),
  counts = counts
)

plotdat %>%
  ggplot(aes(response, counts)) +
  geom_col() +
  # Place each "percent  count" label slightly above its bar
  geom_text(aes(y = counts + 10, label = paste0(percents, "%  ", counts))) +
  labs(y = "respondents") +
  theme_classic()
Michael Roswell
1

This is a helpful solution from another question on SO:

library(ggplot2)
library(scales)
library(magrittr)  # provides the %>% pipe used below

data.frame(response = sample(c("Yes", "No"), size = 100, replace = TRUE, prob = c(0.4, 0.6))) %>%
  ggplot(aes(x = response)) +
  # after_stat() replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  geom_text(aes(y = after_stat(count / sum(count)),
                label = after_stat(scales::percent(count / sum(count)))),
            stat = "count", vjust = -0.25) +
  scale_y_continuous(labels = percent) +
  labs(title = "Proportion of Responses", y = "Percent", x = "Response")
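
If both the count and the percentage are wanted in each label, one possible adaptation (not part of the answer as posted) is to keep the raw count on the y-axis and build the label text inside after_stat(). This is only a sketch: the data are simulated, the "count (percent)" label format is an assumption, and the names responses and p are made up for illustration.

```r
library(ggplot2)
library(scales)

# Simulated data; the seed only makes the sketch reproducible
set.seed(1)
responses <- data.frame(
  response = sample(c("Yes", "No"), size = 100, replace = TRUE, prob = c(0.4, 0.6))
)

p <- ggplot(responses, aes(x = response)) +
  geom_bar() +  # bar height is the raw count
  geom_text(
    # Label each bar with "count (percent)", computed from the count stat
    aes(label = after_stat(paste0(count, " (", percent(count / sum(count)), ")"))),
    stat = "count", vjust = -0.25
  ) +
  labs(y = "Count", x = "Response")
p
```

Because the label is derived from the same count statistic as the bars, no separate summary table is needed.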


OTStats
  • Thanks for the response, OTStats. Unfortunately it does not answer my question, as it does not display both counts and percentages, but thank you for the effort – ALEX.VAMVAS Feb 09 '19 at 23:35
1

This should get you going:

library(tidyverse)

df %>%
  group_by(Response) %>%
  summarise(count = n()) %>%
  mutate(Label = paste0(count, " - ", round(count / sum(count) * 100, 2), "%")) %>%
  ggplot(aes(x = Response, y = count)) +
  geom_bar(stat = 'identity', fill = 'lightblue') +
  geom_text(aes(label = Label)) +
  theme_minimal()

The idea is to create a Label column in the summarised data and pass it to geom_text.

A dummy data frame (define it before running the code above):

df <- data.frame(
  ID = c(1:100),
  Response = c(rep("Yes", 60), rep("No", 40))
)
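
Putting the pieces in run order, a minimal end-to-end sketch could look like the following. It assumes the dummy data above; the name summary_df, the "percent% (count)" label format, and the vjust nudge that lifts the labels above the bars are illustrative choices rather than part of the original answer.

```r
library(dplyr)
library(ggplot2)

# Dummy data in the same shape as the question: 60 "Yes" and 40 "No"
df <- data.frame(
  ID = 1:100,
  Response = c(rep("Yes", 60), rep("No", 40))
)

# One row per response, with its count and a "percent% (count)" label
summary_df <- df %>%
  count(Response, name = "count") %>%
  mutate(Label = paste0(round(100 * count / sum(count), 2), "% (", count, ")"))

ggplot(summary_df, aes(x = Response, y = count)) +
  geom_col(fill = "lightblue") +
  geom_text(aes(label = Label), vjust = -0.5) +  # nudge labels above the bars
  theme_minimal()
```

Computing the label in the summary step keeps the plotting code simple, and the label format can be changed in one place.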
arg0naut91