
I am new to R and am trying to create a population pyramid plot similar to the first one here: https://klein.uk/teaching/viz/datavis-pyramids/. I have a dataset with two variables, sex and age group, that looks like this:

      sex       age_group
1     Male      20-30
2     Female    50-60
3     Male      70-80
4     Male      10-20
5     Female    80-90
...   ...       ...

This is the code I used:

ggplot(data = pyramid_graph(x = age_group, fill = sex)) +
geom_bar(data = subset(pyramid_graph, sex == "F")) + 
geom_bar(data = subset(pyramid_graph, sex == "M")) + 
mapping = aes(y = - ..count.. ),
position = "identity") + 
scale_y_continuous(labels = abs) +
coord_flip()

I do not get any errors from R but when I execute this code a blank image is produced.

Can anyone help? Thank you

Assir
  • Hello Clarissa, welcome to SO. You maximise your chances of getting a useful answer if you post a minimal reproducible example. [This post](https://stackoverflow.com/help/minimal-reproducible-example) may help. Specifically you need to include your input data. For a start, your input dataset needs at least THREE variables: age group, sex, and the value to plot. Secondly, I suspect you have an unmatched closing bracket in your `position="identity")` Thirdly, make `fill=sex` an aesthetic and use only one `geom_bar` with appropriate changes to the `mapping`. – Limey Jul 04 '20 at 09:46
  • There are specialised packages for population pyramids, such as the aptly-named __pyramid__ package. – Edward Jul 04 '20 at 10:19

1 Answer


Using a similar input dataset from the same website that you cite in your question:

# Packages used below
library(dplyr)
library(ggplot2)
# Obtain source data
load(url("http://klein.uk/R/Viz/popGH.RData"))
# Convert to summary table
df <- as_tibble(popGH) %>% 
        mutate(AgeDecade=as.factor(floor(AGE/10)*10)) %>% 
        group_by(SEX, AgeDecade) %>% 
        dplyr::summarise(N=n(), .groups="drop") %>% 
        # A more transparent way of managing the transformation to get "Females to the left".
        mutate(PlotN=ifelse(SEX=="Female", -N, N)) 
# Create the plot
df %>% ggplot() +
   geom_col(aes(fill=SEX, x=AgeDecade, y=PlotN)) +
   # Label both sides of the axis with positive values
   scale_y_continuous(breaks=c(-2*10**5, 0, 2*10**5), labels=c("200k", "0", "200k")) +
   labs(y="Population", x="Age group") +
   scale_fill_discrete(name="Sex") +
   coord_flip()

Gives

[Image: the resulting population pyramid plot]

Note that I've created a new column to create the "females to the left" effect in the plot. Normally, I'd avoid doing that and would rely on the options to the various ggplot functions to achieve the same thing (much as you have attempted to do). However, in this case, I think it's far more transparent (and simpler) to use the extra column rather than to modify the mapping.
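If your data has one row per person, as in the question, the same extra-column idea can be applied after counting the rows first. The following is only a minimal sketch: the `pyramid_graph` data frame here is randomly generated stand-in data with the question's column names, and `pyramid_counts`, `N`, and `PlotN` are names chosen for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the question's data: one row per person
set.seed(1)
pyramid_graph <- tibble(
  sex = sample(c("Male", "Female"), 200, replace = TRUE),
  age_group = sample(c("10-20", "20-30", "50-60", "70-80", "80-90"),
                     200, replace = TRUE)
)

# Count the rows per sex and age group, then negate the female counts
pyramid_counts <- pyramid_graph %>%
  count(sex, age_group, name = "N") %>%
  mutate(PlotN = ifelse(sex == "Female", -N, N))

# Plot the signed counts; labels = abs shows positive values on both sides
p <- ggplot(pyramid_counts) +
  geom_col(aes(x = age_group, y = PlotN, fill = sex)) +
  scale_y_continuous(labels = abs) +
  labs(x = "Age group", y = "Count") +
  scale_fill_discrete(name = "Sex") +
  coord_flip()
print(p)
```

Counting up front means `geom_col()` only has to draw the values it is given, so a single layer is enough and no `stat` is involved.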

Limey