I am learning ggplot2 and trying to make a chart with multiple boxplots, one per age group along the x-axis, each showing the distribution of total population for that age group on the y-axis. I am following a prompt to create a specific plot (shown below).
Each age group has a numeric code in the AGEGRP column, from 0 (all age groups combined) to 18 (the highest age group). There is also a TOT_POP column (total population). The dataset looks like this:
And this is the chart I am trying to create. I think the trick will be to convert the AGEGRP values to a factor.
Here's the code I tried for the plot, but it fails. Would someone please point me in the right direction? Thank you.
library(tidyverse)

popSample <- read.csv("./datafiles/cc-est2018-alldata.csv") %>%
  select("STNAME", "CTYNAME", "YEAR", "AGEGRP", "TOT_POP", "TOT_MALE", "TOT_FEMALE")

ageGroups <- ggplot(popSample, aes(y=TOT_POP)) + +
  geom_boxplot(aes(x=AGEGRP), fill = "red", alpha = 0.5, color="darkred") +
  labs(title ="Populations", x = "Distribution", y = "Age Groups") +
  theme_light()

ageGroups
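For reference, here is a minimal sketch of one possible fix, not a definitive answer. Two things in the code above look likely to cause trouble: the doubled `+ +` after `ggplot(...)`, which R treats as a unary plus applied to the layer and rejects, and the numeric AGEGRP column, which ggplot2 treats as a continuous x value and so draws a single box unless it is converted with `factor()`. The x and y labels in `labs()` also appear to be swapped. The data frame `popDemo` and the object `ageGroupsDemo` are hypothetical stand-ins with made-up values, used only so the sketch runs without the CSV file.

```r
library(ggplot2)

# Hypothetical stand-in for the census data: three age groups with
# made-up populations (assumption, not the real cc-est2018 values).
set.seed(1)
popDemo <- data.frame(
  AGEGRP  = rep(1:3, each = 20),
  TOT_POP = round(runif(60, min = 100, max = 5000))
)

# factor(AGEGRP) makes the x-axis discrete, so each age group gets its own box.
# Note the single `+` between ggplot() and geom_boxplot().
ageGroupsDemo <- ggplot(popDemo, aes(x = factor(AGEGRP), y = TOT_POP)) +
  geom_boxplot(fill = "red", alpha = 0.5, color = "darkred") +
  labs(title = "Populations", x = "Age Groups", y = "Distribution") +
  theme_light()

ageGroupsDemo
```

With the real data, the same `aes(x = factor(AGEGRP), y = TOT_POP)` mapping should apply to `popSample`. Since AGEGRP 0 is the all-ages total, it may need to be filtered out first if the target chart shows only the individual age groups.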