This is the top of my dataset:
  state        start_date end_date   created_at    cycle party answer candidate_name pct  survey_length
1 Florida      2020-11-02 2020-11-02 6/14/21 15:36 2020  REP   Trump  Donald Trump   48.0 0 days
2 Iowa         2020-11-01 2020-11-02 11/2/20 09:02 2020  REP   Trump  Donald Trump   48.0 1 days
3 Pennsylvania 2020-11-01 2020-11-02 11/2/20 12:49 2020  REP   Trump  Donald Trump   49.2 1 days
4 Florida      2020-11-01 2020-11-02 11/2/20 19:02 2020  REP   Trump  Donald Trump   48.2 1 days
5 Florida      2020-10-31 2020-11-02 11/4/20 09:17 2020  REP   Trump  Donald Trump   49.4 2 days
6 Nevada       2020-10-31 2020-11-02 11/4/20 10:38 2020  REP   Trump  Donald Trump   49.1 2 days
I want to take the average of the 'pct' column for each month within each state.
I can filter the data one state at a time and use aggregate() like this:
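library(dplyr)  # for filter()
# Keep one state, then reduce end_date to its two-digit month (assumes end_date is a Date)
Alabama <- filter(prep2020, state == 'Alabama')
Alabama$end_date <- format(Alabama$end_date, '%m')
# Mean pct per month for that state
AL <- aggregate(Alabama$pct, by = list(Alabama$end_date), FUN = mean)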
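To make that single-state step reproducible, here is a minimal sketch built from the three Florida rows in the sample above. The data frame name `polls` and the extra `month` column are only for illustration, and I've used base R subset() in place of filter() so it runs without dplyr:

```r
# Toy data copied from the Florida rows of the sample; `polls` is a made-up name
polls <- data.frame(
  state    = c("Florida", "Florida", "Florida"),
  end_date = as.Date(c("2020-11-02", "2020-11-02", "2020-11-02")),
  pct      = c(48.0, 48.2, 49.4)
)

# Keep one state, then store the two-digit month of end_date in a new column
florida <- subset(polls, state == "Florida")
florida$month <- format(florida$end_date, "%m")

# Mean pct per month for that state; the result has columns `month` and `x`
fl <- aggregate(florida$pct, by = list(month = florida$month), FUN = mean)
print(fl)
```

All three rows end in November, so this returns a single row; with the full data there would be one row per month.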
I think the best way would be to write a function that does this for a single state and then apply it to every state with lapply(), but I can't figure out how to do that. Any suggestions?