I'm trying to automate the process of running hundreds of different regression analyses within the same table, based on different fields. I'm using the lm function.

My data has a number of different election results at the county level. I would like to compare the support for every candidate against the percentage of voters over the age of 65 to see if there's a relationship between the two variables. For example: "Is there a relationship between the share of older voters in a county and their support for candidate x?" I have hundreds of different elections, each with multiple candidates, and hundreds of counties for each election. I would like to run a regression analysis for each candidate, in every race, for every county, and export a table that gives the slope and intercept for each analysis.

My input table is

contest  county  candidate  percent_over_65  percent_support
1        1       1          .65              .44
1        1       2          .65              .34
1        1       3          .65              .22
1        2       1          .70              .60
1        2       2          .70              .30
1        2       3          .70              .10
2        1       4          .65              .70
2        1       5          .65              .30
2        2       4          .70              .60
2        2       5          .70              .40

My ideal output would be something like:

contest  county  candidate  slope_value  intercept_value
1        1       1           .05          .65
1        1       2          -.01          .23
1        1       3           .02          .17
1        2       1           .25          .36
1        2       2           .15          .45
1        2       3          -.02          .12
2        1       4           .75          .33
2        1       5          -.10          .18

This question, and the answer by Hadley towards the bottom with 57 upvotes, was very helpful: he used a plyr function that "deconstructed" the process once. Now I essentially want to nest another plyr call within the original function (if that makes sense). It seems like I could add a couple of for-loops to the mix to get the desired result, but I haven't been able to figure it out. Any help would be much appreciated. Thanks!
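For reference, here is a minimal base-R sketch of the kind of grouped regression described above. It rests on an assumption that is not stated in the post: each county contributes only one row per candidate, so the model is fit across counties for every contest/candidate pair rather than once per county. The helper name `fit_one` and the data frame name `elections` are hypothetical; the rows are copied from the sample input table.

```r
# Sample rows copied from the input table in the question.
elections <- data.frame(
  contest         = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2),
  county          = c(1, 1, 1, 2, 2, 2, 1, 1, 2, 2),
  candidate       = c(1, 2, 3, 1, 2, 3, 4, 5, 4, 5),
  percent_over_65 = c(.65, .65, .65, .70, .70, .70, .65, .65, .70, .70),
  percent_support = c(.44, .34, .22, .60, .30, .10, .70, .30, .60, .40)
)

# Hypothetical helper: fit one regression on a subset of rows and
# return its coefficients as a one-row data frame.
fit_one <- function(d) {
  fit <- lm(percent_support ~ percent_over_65, data = d)
  data.frame(
    contest         = d$contest[1],
    candidate       = d$candidate[1],
    slope_value     = unname(coef(fit)[2]),
    intercept_value = unname(coef(fit)[1])
  )
}

# Assumption: one model per contest/candidate pair, fit across counties.
# drop = TRUE discards contest/candidate combinations that never occur.
groups <- split(elections,
                list(elections$contest, elections$candidate),
                drop = TRUE)

# Fit each group and stack the one-row results into a single table.
results <- do.call(rbind, lapply(groups, fit_one))
rownames(results) <- NULL
print(results)
```

The resulting table could then be exported with `write.csv(results, "results.csv", row.names = FALSE)`, where the file name is only a placeholder. With plyr, the same grouping should be expressible as a single `ddply(elections, .(contest, candidate), ...)` call, which avoids nesting one call inside another.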

Billy B