0

I'm using R to work with the US county-level voting data that the good folks at MIT steward. I'd like to know the total votes each candidate got in each county. For some states, such as Wisconsin, that's easy:

"state", "county_name", "county_fips", "candidate", "party", "candidatevotes", "totalvotes", "mode"<br>
"WISCONSIN", "WINNEBAGO", "55139", "JO JORGENSEN", "LIBERTARIAN", 1629, 94032, "TOTAL"

For other states, such as Utah, it's doable:

"UTAH", "WEBER", "49057", "DONALD J TRUMP", "REPUBLICAN", 0, 111403, "EARLY"<br>
"UTAH", "WEBER", "49057", "DONALD J TRUMP", "REPUBLICAN", 0, 111403, "ELECTION DAY"<br>
"UTAH", "WEBER", "49057", "DONALD J TRUMP", "REPUBLICAN", 0, 111403, "MAIL"<br>
"UTAH", "WEBER", "49057", "DONALD J TRUMP", "REPUBLICAN", 65949, 111403, "TOTAL"

South Carolina, however, is problematic:

"SOUTH CAROLINA", "YORK", "45091", "JOSEPH R BIDEN JR", "DEMOCRAT", 13656, 144050, "ABSENTEE BY MAIL"<br>
"SOUTH CAROLINA", "YORK", "45091", "JOSEPH R BIDEN JR", "DEMOCRAT", 22075, 144050, "ELECTION DAY"<br>
"SOUTH CAROLINA", "YORK", "45091", "JOSEPH R BIDEN JR", "DEMOCRAT", 18, 144050, "FAILSAFE"<br>
"SOUTH CAROLINA", "YORK", "45091", "JOSEPH R BIDEN JR", "DEMOCRAT", 176, 144050, "FAILSAFE PROVISIONAL"<br>
"SOUTH CAROLINA", "YORK", "45091", "JOSEPH R BIDEN JR", "DEMOCRAT", 22950, 144050, "IN-PERSON ABSENTEE"<br>
"SOUTH CAROLINA", "YORK", "45091", "JOSEPH R BIDEN JR", "DEMOCRAT", 133, 144050, "PROVISIONAL"

It seems to me that there should be some way to loop over the FIPS codes and the party name to generate the totals for each county, but I'm stumped.

camille
  • 16,432
  • 18
  • 38
  • 60
eBerm
  • 13
  • 1
  • Lots of possibilities over here: https://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group , if I'm interpreting the question correctly. – thelatemail Jan 20 '22 at 02:50

1 Answers1

0

Does this solve your problem?

library(tidyverse)

df <- read_csv("~/Desktop/countypres_2000-2020.csv")
#> Rows: 72617 Columns: 12
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (8): state, state_po, county_name, county_fips, office, candidate, party...
#> dbl (4): year, candidatevotes, totalvotes, version
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

df %>%
  filter(year == 2020) %>%
  group_by(candidate, county_fips) %>%
  summarise(
    county_name,
    total_votes_per_candidate_per_county = sum(candidatevotes)
    ) %>%
  relocate(candidate, .before = 4) %>%
  distinct() %>%
  arrange(county_fips)
#> `summarise()` has grouped output by 'candidate', 'county_fips'. You can override using the `.groups` argument.
#> # A tibble: 11,902 × 4
#> # Groups:   candidate, county_fips [11,898]
#>    county_fips county_name candidate         total_votes_per_candidate_per_coun…
#>    <chr>       <chr>       <chr>                                           <dbl>
#>  1 01001       AUTAUGA     DONALD J TRUMP                                  19838
#>  2 01001       AUTAUGA     JOSEPH R BIDEN JR                                7503
#>  3 01001       AUTAUGA     OTHER                                             429
#>  4 01003       BALDWIN     DONALD J TRUMP                                  83544
#>  5 01003       BALDWIN     JOSEPH R BIDEN JR                               24578
#>  6 01003       BALDWIN     OTHER                                            1557
#>  7 01005       BARBOUR     DONALD J TRUMP                                   5622
#>  8 01005       BARBOUR     JOSEPH R BIDEN JR                                4816
#>  9 01005       BARBOUR     OTHER                                              80
#> 10 01007       BIBB        DONALD J TRUMP                                   7525
#> # … with 11,892 more rows

Created on 2022-01-20 by the reprex package (v2.0.1)

jared_mamrot
  • 22,354
  • 4
  • 21
  • 46
  • That did the trick. Thank you! – eBerm Jan 20 '22 at 12:43
  • You're welcome @eBerm, but please note that comments are not meant for 'saying thanks'. See [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers), specifically: "Please do not add a comment on your question or on an answer to say "Thank you". Comments are meant for requesting clarification, leaving constructive criticism, or adding relevant but minor additional information – not for socializing. If you want to say "thank you," vote on or accept that person's answer, or simply pay it forward by providing a great answer to someone else's question." – jared_mamrot Jan 20 '22 at 22:26