Combine multiple character rows in R using group_by and summarize

Question

I have a dataset that looks like the following:

Invoice  Pizza  Pasta  Soda  Cake
1        NA     pasta  NA    NA
1        NA     NA     NA    cake
2        pizza  NA     NA    NA
2        NA     pasta  NA    NA
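
For anyone who wants to reproduce the problem, the table above could be built roughly as follows. This is only a sketch: the name `df`, the integer type of `Invoice`, and the use of real `NA` values in character columns (rather than the text "NA") are assumptions, not something stated in the question.

```r
# Assumed reconstruction of the example data:
# character columns holding real NA values, one row per line of the table.
df <- data.frame(
  Invoice = c(1L, 1L, 2L, 2L),
  Pizza   = c(NA, NA, "pizza", NA),
  Pasta   = c("pasta", NA, NA, "pasta"),
  Soda    = c(NA_character_, NA, NA, NA),  # all missing, typed as character
  Cake    = c(NA, "cake", NA, NA),
  stringsAsFactors = FALSE
)
```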

I want to group it by Invoice and get the following output:

Invoice  Pizza  Pasta  Soda  Cake
1        NA     pasta  NA    cake
2        pizza  pasta  NA    NA

I'm trying to use the group_by(Invoice) %>% summarize() feature of dplyr, but I am unable to get the desired output. Please suggest a good method, thanks!

Is there always only one non `NA` value per group in every column? — LAP, Feb 13 '19 at 07:38
See https://stackoverflow.com/questions/28036294/collapsing-rows-where-some-are-all-na-others-are-disjoint-with-some-nas — Cyrus Mohammadian, Feb 13 '19 at 07:41
@LAP Yes, there is only one value other than NA. The column name is the same as the value that appears in the row when it is not NA. — ANP, Feb 13 '19 at 07:43
@Cyrus The linked post is helpful, but my data is non-numeric, so how do I sum over the rows? — ANP, Feb 13 '19 at 07:46
There is a solution in the linked post that works with non-numeric data: https://stackoverflow.com/a/28036595/680068 — zx8754, Feb 13 '19 at 08:07

score 1 · Answer 1 · answered Feb 13 '19 at 07:55

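library(dplyr)

# Collapse each group's values into one comma-separated string (NA becomes the text "NA"),
# then strip the first "NA," or ",NA". funs() is deprecated since dplyr 0.8.0, so a formula is used.
df %>% group_by(Invoice) %>% 
       summarise_all(~ sub('NA,|,NA', '', paste(., collapse = ',')))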

# A tibble: 2 x 5
  Invoice Pizza Pasta Soda  Cake 
    <int> <chr> <chr> <chr> <chr>
1       1 NA    pasta NA    cake 
2       2 pizza pasta NA    NA
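
An alternative that avoids string pasting is to take the first non-NA value per group and column. The following is a minimal sketch rather than the answer's own method. It assumes dplyr 1.0.0 or later (for `across()`), at most one non-NA value per group in each column (as the asker confirmed in the comments), and a `df` with real `NA` values. The helper name `first_non_na` is hypothetical.

```r
library(dplyr)

# Assumed reconstruction of the question's data (real NA values, not the text "NA")
df <- data.frame(
  Invoice = c(1L, 1L, 2L, 2L),
  Pizza   = c(NA, NA, "pizza", NA),
  Pasta   = c("pasta", NA, NA, "pasta"),
  Soda    = c(NA_character_, NA, NA, NA),
  Cake    = c(NA, "cake", NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-NA element of x.
# Indexing an empty vector with [1] yields NA of the same type,
# so all-NA groups stay NA.
first_non_na <- function(x) x[!is.na(x)][1]

# One row per Invoice; every non-grouping column is reduced with the helper
result <- df %>%
  group_by(Invoice) %>%
  summarise(across(everything(), first_non_na))

print(result)
```

Unlike the paste/sub approach, this keeps missing entries as real `NA` values and works for any number of rows per group. If a group ever holds several non-NA values in one column, only the first is kept.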
