I am finding it incredibly difficult to add labels to bar plots using ggplot2. I am working with the titanic dataset, and am having to create additional data frames just for the sake of adding labels - and this whole thing is arduous and driving me crazy.
This is what the basic code and chart look like:
titanic %>% ggplot(aes(x=Sex, fill=Survived))+
geom_bar() +
scale_fill_discrete(name=NULL,labels=c("Dead", "Survived")) +
labs(y="Number of Passengers", title="Titanic Survival Rates by Sex")
As you can see, there are no labels on the bars. Because there is no "y" variable in the aesthetic mappings, the geom_text(aes(label= xxx)) layer does not work. Also, without a "y" variable, geom_bar(stat="identity") doesn't work. This is what I did to get around the problem:
# Create a data frame from a two-way table including Survived and Sex
table(titanic$Survived,titanic$Sex)
#     female male
#   0     81  468
#   1    233  109
rates_by_sex<-data.frame(Sex=c("Female","Male"), Dead=c(81,468), Survived=c(233,109))
# Convert data frame to long format
library(reshape2)  # provides melt()
rates_by_sex_long <- melt(rates_by_sex, id="Sex")
rates_by_sex_long
#      Sex variable value
# 1 Female     Dead    81
# 2   Male     Dead   468
# 3 Female Survived   233
# 4   Male Survived   109
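The hand-typed data frame and the melt() step could probably be collapsed into one call, since as.data.frame() on a table already returns long format. This is only a sketch: the `toy` data frame and its counts are hypothetical stand-ins for the real titanic data, and the column names are chosen to match the melt() output.

```r
# Toy stand-in data (hypothetical); with the real data, use titanic$Sex and
# titanic$Survived instead.
toy <- data.frame(
  Sex = rep(c("Female", "Male"), times = c(10, 17)),
  Survived = rep(c("Dead", "Survived", "Dead", "Survived"),
                 times = c(5, 5, 14, 3))
)

# as.data.frame() on a two-way table yields one row per Sex/Survived
# combination, with the count stored in the column named by responseName.
counts_long <- as.data.frame(
  table(Sex = toy$Sex, variable = toy$Survived),
  responseName = "value"
)
print(counts_long)
```

This removes the need to copy numbers out of the table() printout by hand, although it still means building one summary data frame per chart.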
ggplot2 can now make use of geom_text() and aes(label=value):
rates_by_sex_long %>% ggplot(aes(x=Sex, y=value, fill=variable)) +
geom_bar(stat="identity") +
geom_text(aes(label=value), position = position_stack(vjust=0.5),colour = "white", size = 5) +
scale_fill_discrete(name=NULL) +
labs(y="Number of Passengers",title="Titanic Survival Rates by Sex")
Now this gives me the following chart with labels:
Here is another one I did using the same arduous method just to show the percentages:
# Manually create a data frame with the rate of survival.
table(titanic$Survived) # Gives raw counts of each category
100*round(prop.table(table(titanic$Survived)),4) # Survival rate in percentages
titanic_survival_rate<-data.frame(Survived=c("Yes","No"),Number=c(342,549), Percent=c(38.38,61.62))
titanic_survival_rate %>% ggplot(aes(x=Survived, y=Number)) +
geom_bar(stat="identity",fill="steelblue", colour = "black") +
geom_text(aes(label=paste0(Percent,"%")),nudge_y=25,colour = "black", size = 4) +
labs(y="Number of Passengers",title="Titanic Survival Rate")
Doing it this way is highly inefficient. There are many charts to make, and constructing a separate data frame for each of them is impractical. I don't even know what I would do when faceting.
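At the very least, the counts and percentages in the summary data frame can be computed rather than typed in. A minimal sketch in base R, assuming a vector of outcomes; the `survived` vector below is hypothetical toy data, not the real titanic column:

```r
# Toy outcome vector (hypothetical); with the real data this would be
# titanic$Survived.
survived <- rep(c("No", "Yes"), times = c(19, 8))

# Count each category once, then derive the percentages from the same table.
tab <- table(survived)
survival_rate <- data.frame(
  Survived = names(tab),
  Number = as.vector(tab),
  Percent = round(100 * as.vector(prop.table(tab)), 2)
)
print(survival_rate)
```

That still leaves one helper data frame per chart, which is the part I would like to avoid.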
Question: How can I get labels (counts and percentages) on bar plots of a categorical variable? I know it can be done with some additional code (i.e., adding something to geom_text()), but I can't quite figure it out.
Please feel free to use this reproducible code:
library(ggplot2)
library(magrittr)  # provides %>%
df <- data.frame(
  survived = c(1,1,0,0,0,1,0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0),
  sex = c("M","F","M","M","M","F","F","F","M","M","M","F","F","F","M","F","M","F","F","M","M","M","M","M","M","M","M")
)
df$survived<-as.factor(df$survived)
df %>% ggplot(aes(x=sex, fill=survived))+geom_bar()+geom_text(aes(label=???))
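One direction that may work is to let geom_text() use the same counting stat as geom_bar(), so the label comes from the computed count rather than from a column in the data. The sketch below assumes ggplot2 3.3.0 or later (for after_stat()) and uses a hypothetical `toy` data frame rather than the real data:

```r
library(ggplot2)

# Toy stand-in data (hypothetical), shaped like df above.
toy <- data.frame(
  survived = factor(rep(c(0, 1, 0, 1), times = c(5, 5, 14, 3))),
  sex = rep(c("F", "M"), times = c(10, 17))
)

p <- ggplot(toy, aes(x = sex, fill = survived)) +
  geom_bar() +
  # stat = "count" makes the text layer tally rows per sex/survived group,
  # and after_stat(count) maps that tally to the label.
  geom_text(
    aes(label = after_stat(count)),
    stat = "count",
    position = position_stack(vjust = 0.5),  # centre labels in each segment
    colour = "white"
  )
print(p)
```

Because the counts are computed inside the layer, no helper data frame is needed. For percentages, an expression on the computed count inside after_stat() may work along the same lines, but that is untested here.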