I have a data frame that lists 4 geographical regions and the population size for each single year of age from 60 to 90+. I know the count for each age (for example, there are 7749 60-year-olds in East Sussex), so I should be able to work out the average age for each region. However, I cannot work out which formula to use without first creating a data frame that lists every individual.

How do I get R to recognise the count for each age group and find the mean?

County      Pop     Age
East Sussex 7749    60
East Sussex 7818    61
East Sussex 7517    62
East Sussex 7390    63
East Sussex 7351    64
East Sussex 6993    65
East Sussex 7239    66
East Sussex 7143    67
East Sussex 7004    68
East Sussex 7177    69
East Sussex 7426    70
East Sussex 7673    71
East Sussex 8446    72
East Sussex 9372    73
East Sussex 7003    74
East Sussex 6532    75
East Sussex 6442    76
East Sussex 5834    77
East Sussex 5146    78
East Sussex 4234    79
East Sussex 4376    80
East Sussex 4349    81
East Sussex 4161    82
East Sussex 3803    83
East Sussex 3491    84
East Sussex 3188    85
East Sussex 2796    86
East Sussex 2539    87
East Sussex 2340    88
East Sussex 2165    89
East Sussex 9216    90+
Marco Sandri
Jack
I can help you if you provide your data frame through dput(yourdataframe) and copy it into your post. See here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – pbraeutigm Aug 12 '21 at 10:51

2 Answers

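You can use weighted.mean() to calculate the mean age for each County, with Pop as the weights. Because the top band is stored as "90+", Age has to be converted to a number first: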

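library(dplyr)

result <- df %>%
  # "90+" is not numeric; parse_number() extracts the 90 (needs the readr package)
  mutate(Age = readr::parse_number(as.character(Age))) %>%
  group_by(County) %>%
  # mean of Age, weighted by the population count at each age
  summarise(mean_age = weighted.mean(Age, Pop), .groups = 'drop')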
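
For intuition, here is a minimal base R sketch of what weighted.mean() computes: the sum of age times count, divided by the sum of counts. The `toy` data frame and its values are made up for illustration and are not the asker's data. Treating "90+" as exactly 90 is an assumption that will slightly understate the true mean.

```r
# Hypothetical toy data in the same shape as the question (values are invented)
toy <- data.frame(
  County = c("A", "A", "B", "B"),
  Pop    = c(10, 30, 5, 5),
  Age    = c("60", "90+", "60", "62"),
  stringsAsFactors = FALSE
)

# Assumption: treat the open-ended "90+" band as exactly 90
toy$AgeNum <- as.numeric(sub("+", "", toy$Age, fixed = TRUE))

# Weighted mean by hand, per county: sum(age * count) / sum(count)
by_hand <- sapply(split(toy, toy$County),
                  function(d) sum(d$AgeNum * d$Pop) / sum(d$Pop))

# The same calculation with weighted.mean()
by_fun <- sapply(split(toy, toy$County),
                 function(d) weighted.mean(d$AgeNum, d$Pop))

print(by_hand)  # A = 82.5, B = 61
print(by_fun)   # identical to by_hand
```

No expansion to one row per person is needed, because the counts enter the calculation directly as weights.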
Ronak Shah
The same calculation using data.table:

library(data.table)
# setDT() converts df to a data.table by reference; sub() strips the literal "+" from "90+"
setDT(df)[, .(mean_age = weighted.mean(as.numeric(sub("+", "", Age, fixed = TRUE)), Pop)), by = County]
akrun