I'm trying to merge duplicate rows in this data frame in R, summing the in and out columns:

first last  city    in   out   
john  doe   sf      1    0
mary  jane  cl      0    1 
john  doe   sf      1    0
mary  jane  cl      1    0 
john  shmo  dn      1    0

to get this result:

john  doe   sf    2    0
mary  jane  cl    1    1 
john  shmo  dn    1    0
Phil
    Using `dplyr` (`in` is a reserved word in R, so it needs backticks): `` mydf %>% group_by(first, last, city) %>% summarize(`in` = sum(`in`), out = sum(out)) %>% ungroup() `` – Phil Nov 03 '20 at 23:41

1 Answer

I'm just going to move Phil's answer from the comment into an actual answer. Using dplyr is probably the easiest way to solve the problem.

library(dplyr)

# `in` is a reserved word in R, so the column name must be backtick-quoted
mydf %>% group_by(first, last, city) %>% 
  summarize(`in` = sum(`in`), out = sum(out)) %>% 
  ungroup()

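Just don't forget the ungroup() call. After summarize() the result is still grouped by first and last, so later operations such as mutate() or another summarize() would silently run per group, which can lead to some nasty surprises.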
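
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and runs the pipeline. The name result is just illustrative, and the way mydf is constructed is an assumption about how the original data frame looks. It also uses the .groups = "drop" argument of summarize() (available in dplyr 1.0.0 and later), which should have the same effect as the trailing ungroup().

```r
library(dplyr)

# Sample data from the question. check.names = FALSE keeps the column
# literally named "in"; otherwise data.frame() would rename the reserved
# word to "in.".
mydf <- data.frame(
  first = c("john", "mary", "john", "mary", "john"),
  last  = c("doe", "jane", "doe", "jane", "shmo"),
  city  = c("sf", "cl", "sf", "cl", "dn"),
  `in`  = c(1, 0, 1, 1, 1),
  out   = c(0, 1, 0, 0, 0),
  check.names = FALSE
)

# Collapse duplicate (first, last, city) rows by summing the counters.
# .groups = "drop" returns an ungrouped result directly.
result <- mydf %>%
  group_by(first, last, city) %>%
  summarize(`in` = sum(`in`), out = sum(out), .groups = "drop")

print(result)
```

If mydf was read from a file with the default check.names = TRUE, the column may actually be called in. rather than in, so it is worth checking names(mydf) first.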

Dharman
bdempe
What if the data has a character column, like city, with NA values that also need to be filled in?

first last  city    in   out
john  doe   sf      1    0
mary  jane  cl      0    1
john  doe   NA      1    0
mary  jane  NA      1    0
john  shmo  dn      1    0

– Daniel Hodges Jun 03 '21 at 16:32
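
One possible approach to the NA case, offered only as a sketch: fill the missing city within each person before grouping, using fill() from tidyr. This assumes that first and last together identify a person and that each person has exactly one non-NA city somewhere in the data; neither assumption is stated in the comment, so it may not fit the real data.

```r
library(dplyr)
library(tidyr)

# Sample data from the comment, with city missing on the duplicate rows.
mydf <- data.frame(
  first = c("john", "mary", "john", "mary", "john"),
  last  = c("doe", "jane", "doe", "jane", "shmo"),
  city  = c("sf", "cl", NA, NA, "dn"),
  `in`  = c(1, 0, 1, 1, 1),
  out   = c(0, 1, 0, 0, 0),
  check.names = FALSE
)

result <- mydf %>%
  # Assumption: (first, last) identifies a person with a single known city.
  group_by(first, last) %>%
  # Copy the known city onto the NA rows of the same person.
  fill(city, .direction = "downup") %>%
  # Regroup including city, then sum the counters as before.
  group_by(first, last, city) %>%
  summarize(`in` = sum(`in`), out = sum(out), .groups = "drop")

print(result)
```

If a person can legitimately appear in more than one city, this would assign the NA rows to whichever city is adjacent in row order, so a different rule would be needed.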