Returning more than one variable using Group By and summarize with Dplyr

Question

I'm trying to create a new column in my 2016 election dataset that shows whether the candidate lost or won a county.

 Democrat %>%
  group_by(county) %>%
  summarise(winningvote = max(fraction_votes))

This code only returns the max vote. Can I also return the candidate variable? Adding:

 select(county, fraction_votes, candidate)

doesn't return anything different.

I'll attempt to create an "outcome" variable using mutate for the last line of the code. I was thinking the apply family might be another way to solve this.

Thanks

Is there a column called `candidate`? You should provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). You describe how you want to summarize the `fraction_votes`, but is there only one candidate per county? How do you want to summarize the candidate? — MrFlick, Feb 21 '17 at 20:34

score 1 · Answer 1 · answered Feb 21 '17 at 20:48

If the candidate is a field of the Democrat data frame, the simplest way is to do multiple grouping:

Democrat %>%
  group_by(county, candidate) %>%
  summarise(winningvote = max(fraction_votes))
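
Note that grouping by both `county` and `candidate` yields one row per county-candidate pair, so it keeps the candidate name but does not by itself pick a single winner per county. If one row per county is what is wanted, a minimal sketch is to group by `county` only and keep the row holding the maximum. The toy `Democrat` data and the candidate labels "X" and "Y" below are hypothetical; only the column names come from the question.

```r
library(dplyr)

# Hypothetical toy data; only the column names are taken from the question.
Democrat <- data.frame(
  county         = c("A", "A", "B", "B"),
  candidate      = c("X", "Y", "X", "Y"),
  fraction_votes = c(0.60, 0.40, 0.45, 0.55),
  stringsAsFactors = FALSE
)

# Within each county, keep the row(s) with the highest vote share.
# A tie would return more than one row for that county.
winners <- Democrat %>%
  group_by(county) %>%
  filter(fraction_votes == max(fraction_votes)) %>%
  ungroup() %>%
  rename(winningvote = fraction_votes)
```

Because `filter()` keeps whole rows, `candidate` comes along without needing a separate `select()`.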

denrou

score 0 · Answer 2 · answered Feb 21 '17 at 20:57

There is probably a more succinct way to do this, but the code below flags each county's winning row with 1. You then replace the remaining NA values with 0 (second block of code).

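# Flag each county's top vote share, then join the flag back onto every row
Democrat <- left_join(Democrat, Democrat %>%
  group_by(county) %>%
  summarise(fraction_votes = max(fraction_votes)) %>%
  mutate(Winning_Vote = 1),
  by = c("county", "fraction_votes"))

# Rows that did not match their county's maximum get 0 instead of NA
Democrat$Winning_Vote[is.na(Democrat$Winning_Vote)] <- 0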
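
One possibly shorter route to the same kind of flag is a grouped `mutate()`, which keeps every row and avoids the join. This is only a sketch: the toy `Democrat` data, the candidate labels, and the `outcome` column with its "won"/"lost" values are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical toy data; only the column names are taken from the question.
Democrat <- data.frame(
  county         = c("A", "A", "B", "B"),
  candidate      = c("X", "Y", "X", "Y"),
  fraction_votes = c(0.60, 0.40, 0.45, 0.55),
  stringsAsFactors = FALSE
)

# Compare each row with its own county's maximum and label it.
# Tied rows would all be labelled "won".
with_outcome <- Democrat %>%
  group_by(county) %>%
  mutate(outcome = if_else(fraction_votes == max(fraction_votes),
                           "won", "lost")) %>%
  ungroup()
```

Inside a grouped `mutate()`, `max(fraction_votes)` is evaluated per county, so no NA values are created and no second replacement step is needed.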
