
I'm trying to create a new column in my 2016 election dataset that shows whether the candidate lost or won a county.

 Democrat %>%
  group_by(county) %>%
  summarise(winningvote = max(fraction_votes))

This code only returns the max vote. Can I also return the candidate variable? Adding:

 select(county, fraction_votes, candidate)

doesn't return anything different.

I'll attempt to create an "outcome" variable using mutate for the last line of the code. I was thinking the apply family might be another way to solve this.

Thanks

Andrew Lastrapes
    Is there a column called `candidate`? You should provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). You describe how you want to summarize the `fraction_votes`, but is there only one candidate per county? How do you want to summarize the candidate? – MrFlick Feb 21 '17 at 20:34

2 Answers


If `candidate` is a column of the `Democrat` data frame, the simplest way is to group by both columns:

Democrat %>%
  group_by(county, candidate) %>%
  summarise(winningvote = max(fraction_votes))
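
One caveat: grouping by both columns gives one row per county/candidate pair, so if each candidate appears once per county, the max is just that candidate's own share. If you want only the winner's row with the candidate name kept, here is a minimal sketch that groups by county alone and filters. The toy data and the candidate labels are made up for illustration; the column names are assumed from the question.

```r
library(dplyr)

# Hypothetical toy data; column names are assumed from the question.
Democrat <- tibble(
  county = c("A", "A", "B", "B"),
  candidate = c("Candidate X", "Candidate Y", "Candidate X", "Candidate Y"),
  fraction_votes = c(0.60, 0.40, 0.45, 0.55)
)

# Keep the row(s) with the highest vote share in each county.
# All other columns, including candidate, are retained.
winners <- Democrat %>%
  group_by(county) %>%
  filter(fraction_votes == max(fraction_votes)) %>%
  ungroup()
```

Ties within a county would return more than one row, which may or may not be what you want.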
denrou

I'm pretty confident there's a more succinct way to do this, but the code below gives you a winning-vote flag of 1 for the top vote share in each county. The join leaves NA in every other row, which the final statement then replaces with 0.

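# Flag the top vote share in each county with 1.
winners <- Democrat %>%
  group_by(county) %>%
  summarise(fraction_votes = max(fraction_votes)) %>%
  mutate(Winning_Vote = 1)

# Join the flag back on and store the result; rows that are not a county maximum get NA.
Democrat <- left_join(Democrat, winners, by = c("county", "fraction_votes"))

# Replace NA with 0 in the flag column only, leaving other columns untouched.
Democrat$Winning_Vote[is.na(Democrat$Winning_Vote)] <- 0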
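
For what it's worth, one possibly more succinct route is a grouped `mutate`, which avoids the join and the NA step altogether. This is only a sketch: the toy data and candidate labels are invented, and the column names are assumed from the question.

```r
library(dplyr)

# Hypothetical toy data; column names are assumed from the question.
Democrat <- tibble(
  county = c("A", "A", "B", "B"),
  candidate = c("Candidate X", "Candidate Y", "Candidate X", "Candidate Y"),
  fraction_votes = c(0.60, 0.40, 0.45, 0.55)
)

# Within each county, flag the row(s) holding the maximum vote share:
# 1 for the winner, 0 for everyone else.
flagged <- Democrat %>%
  group_by(county) %>%
  mutate(Winning_Vote = as.integer(fraction_votes == max(fraction_votes))) %>%
  ungroup()
```

Every original row is kept, so the candidate column stays available next to the flag.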
Mark Druffel