I've plotted a histogram with wage on the x-axis and a y-axis that shows the percentage of individuals in the data set who have that particular wage. Now I want each bar to display how many observations it contains. E.g., in the sample_data I've provided, how many wages are in the 10% bars and how many are in the 20% bars?
Here's a small sample of my data:
sample_data<- structure(list(wage = c(81L, 77L, 63L, 84L, 110L, 151L, 59L,
109L, 159L, 71L), school = c(15L, 12L, 10L, 15L, 16L, 18L, 11L,
12L, 10L, 11L), expr = c(17L, 10L, 18L, 16L, 13L, 15L, 19L, 20L,
21L, 20L), public = c(0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L),
female = c(1L, 1L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L), industry = c(63L,
93L, 71L, 34L, 83L, 38L, 82L, 50L, 71L, 37L)), row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame")
Here's my R script:
library(ggplot2)
library(dplyr)
ggplot(data = sample_data) +
  geom_histogram(aes(x = wage, y = stat(count) / sum(count)), binwidth = 4, color = "black") +
  scale_x_continuous(breaks = seq(0, 300, by = 20)) +
  scale_y_continuous(labels = scales::percent_format())
I'm basically happy with this, but whatever I try, I can't get text on top of my bars. Here is one of many attempts, this one using stat_count, that doesn't work:
ggplot(data = sample_data) +
  geom_histogram(aes(x = wage, y = stat(count) / sum(count)), binwidth = 4, color = "black") +
  scale_x_continuous(breaks = seq(0, 300, by = 20)) +
  scale_y_continuous(labels = scales::percent_format()) +
  stat_count(aes(y = ..count.., label = ..count..), geom = "text")
I've also tried using geom_text, to no avail.
EDIT: ANSWER!
Many thanks to those who replied. I ended up using teunbrand's solution with a small modification: I changed after_stat(density) to after_stat(count) / sum(count).
Here's the 'final' code:
ggplot(sample_data) +
  geom_histogram(
    aes(x = wage,
        y = after_stat(count) / sum(count)),
    binwidth = 4, colour = "black"
  ) +
  stat_bin(
    aes(x = wage,
        y = after_stat(count) / sum(count),
        label = after_stat(ifelse(count == 0, "", count))),
    binwidth = 4, geom = "text", vjust = -1
  ) +
  scale_x_continuous(breaks = seq(0, 300, by = 20)) +
  scale_y_continuous(labels = scales::percent_format())
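In case it helps anyone checking the labels: the per-bin numbers can also be read straight out of the plot rather than off the picture. This is only a rough sketch, not part of teunbrand's answer. It assumes ggplot2's layer_data() helper, repeats the wage column in a made-up data frame called wages so it runs on its own, and writes the share as after_stat(count / sum(count)), which I believe is equivalent to the form used above.

```r
library(ggplot2)

# Wage column from sample_data, repeated here so the snippet is self-contained
# ("wages" is just an illustrative name)
wages <- data.frame(wage = c(81, 77, 63, 84, 110, 151, 59, 109, 159, 71))

p <- ggplot(wages) +
  geom_histogram(
    aes(x = wage, y = after_stat(count / sum(count))),
    binwidth = 4, colour = "black"
  )

# layer_data() returns the data ggplot2 computed for a layer;
# layer 1 is the histogram, one row per bin
bins <- layer_data(p, 1)

# Keep the non-empty bins: bin edges, number of observations, plotted share
non_empty <- bins[bins$count > 0, c("xmin", "xmax", "count", "y")]
print(non_empty)

# How many bars there are for each observation count
print(table(non_empty$count))
```

With ten observations, a 10% bar holds one wage and a 20% bar holds two, so the count column should agree with the text labels drawn by stat_bin.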