I have the following data: https://raw.githubusercontent.com/fivethirtyeight/data/master/congress-age/congress-terms.csv

I'm trying to determine how to calculate the mean age of members of Congress by year (termstart) for each party (Republican and Democrat).

I was hoping for some help on how to go about doing this. I am a beginner in R and I'm just playing around with the data.

Thanks!

Tati16
    Have you tried answers from https://stackoverflow.com/questions/21982987/mean-per-group-in-a-data-frame or https://stackoverflow.com/questions/11562656/calculate-the-mean-by-group ? – Ronak Shah Oct 15 '20 at 00:07

1 Answer

Try this approach: filter for the required parties, then group by termstart and party and summarise. After that you can reshape to wide format so that both parties appear as columns for each term start date. Here is the code using tidyverse functions:

library(dplyr)
library(tidyr)
# Data: read the CSV straight from GitHub
df <- read.csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/congress-age/congress-terms.csv', stringsAsFactors = FALSE)
# Code: keep the two parties, average age per term start and party, then reshape to wide
newdf <- df %>%
  filter(party %in% c('R', 'D')) %>%
  group_by(termstart, party) %>%
  summarise(MeanAge = mean(age, na.rm = TRUE)) %>%
  pivot_wider(names_from = party, values_from = MeanAge)

Output:

# A tibble: 34 x 3
# Groups:   termstart [34]
   termstart      D     R
   <chr>      <dbl> <dbl>
 1 1947-01-03  52.0  53.0
 2 1949-01-03  51.4  54.6
 3 1951-01-03  52.3  54.3
 4 1953-01-03  52.3  54.1
 5 1955-01-05  52.3  54.7
 6 1957-01-03  53.2  55.4
 7 1959-01-07  52.4  54.7
 8 1961-01-03  53.4  53.9
 9 1963-01-09  53.3  52.6
10 1965-01-04  52.3  52.2
# ... with 24 more rows
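
If you want the grouping by calendar year rather than by the full termstart date, or want to try the idea without downloading anything, here is a minimal base R sketch. The toy data frame and its values are made up for illustration; only the column names (termstart, party, age) are taken from the CSV. It uses tapply instead of the dplyr pipeline above, so it is an alternative and not the same method.

```r
# Toy data with the same column names as the CSV (values are invented)
toy <- data.frame(
  termstart = c("1947-01-03", "1947-01-03", "1947-01-03",
                "1949-01-03", "1949-01-03", "1949-01-03"),
  party     = c("D", "D", "R", "D", "R", "I"),
  age       = c(50, 54, 60, 48, 56, 70),
  stringsAsFactors = FALSE
)

# Keep only the two major parties
toy <- toy[toy$party %in% c("D", "R"), ]

# Extract the year of the term start as an integer
toy$year <- as.integer(format(as.Date(toy$termstart), "%Y"))

# Mean age with years as rows and parties as columns
mean_age <- tapply(toy$age, list(year = toy$year, party = toy$party),
                   mean, na.rm = TRUE)
print(mean_age)
```

The result is a matrix rather than a tibble, so a single value can be looked up by name, for example mean_age["1947", "D"].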
Duck