I am having difficulty merging two datasets; both from the FEC. When I attempt to right-merge I get a many-to-many relationship and the merged dataset keeps duplicating numbers in all the wrong numbers.
What I tried:
# 2014 US House Election Results
Election_2014 <- read_excel("2014 US House Election by State.xlsx", na = c("", "NA"))
# 2018 US House Election Results
Election_2018 <- read_excel("2018 US House Election by State.xlsx", na = c("", "NA"))
# 2014 and 2018 Election Results
Election_14.18 <- Election_2014 %>% right_join(Election_2018, by=c("STATE", "DISTRICT", "PARTY"))
What I wanted to happen:
# Election_2014
`1` `STATE ABBREVIATION` STATE DISTRICT `(I)` `TOTAL VOTES` PARTY `GENERAL VOTES` `GENERAL %`
1 2 AL Alabama 01 (I) NA R 103758 0.682
2 3 AL Alabama 01 NA NA D 48278 0.317
3 4 AL Alabama 01 NA NA W 198 0.00130
4 5 AL Alabama 01 NA District Votes: NA 152234 NA
5 6 AL Alabama NA NA NA NA NA NA
6 7 AL Alabama 02 (I) NA R 113103 0.673
to merge with:
# Election_2018
`1` `STATE ABBREVIATION` STATE DISTRICT `(I)` `TOTAL VOTES` PARTY `GENERAL VOTES` `GENERAL %`
1 3 AL Alabama 01 (I) NA R 153228 0.632
2 4 AL Alabama 01 NA NA D 89226 0.368
3 5 AL Alabama 01 NA NA D NA NA
4 6 AL Alabama 01 NA Party Votes: D NA NA
5 7 AL Alabama 01 NA NA W 163 0.000672
6 8 AL Alabama 01 NA District Votes: NA 242617 NA
What I got:
`1.x` `STATE ABBREVIATION.x` STATE DISTRICT `(I).x` `TOTAL VOTES.x` PARTY `GENERAL VOTES.x` `GENERAL %.x`
1 2 AL Alabama 01 (I) NA R 103758 0.682 NA
2 3 AL Alabama 01 NA NA D 48278 0.317 NA
3 3 AL Alabama 01 NA NA D 48278 0.317 NA
4 3 AL Alabama 01 NA NA D 48278 0.317 NA
5 4 AL Alabama 01 NA NA W 198 0.00130 NA
6 5 AL Alabama 01 NA District Votes: NA 152234 NA NA
For 12,000 rows :/
I'm wanting just one 'D' and one 'R' per district -- if you can help me get rid of the District votes too, even better. I've only been using R for a month, so any words of advice would be greatly appreciated.