
Right now I'm trying to summarize my rows by unique_id, but keep all the values in every other column. Here is my current code:

library(dplyr)
rm_na_unique <- function(vec){
  unique(vec[!is.na(vec)])
}
trial <- test1 %>%
  group_by(unique_id) %>% 
  summarise_each(funs(toString(rm_na_unique(.))))

This removes duplicates, but how would I change the last part, summarise_each(funs(toString(rm_na_unique(.)))), to keep all values in each cell (even if they're all the same value)? Thank you.

Starting DF

Unique_id  Name  State
   1       Rich   PA
   1       Rich   PA
   1       Rich   PA
   2       Tim    DE
   2       Tim    DE
   2       Tim    DE

Desired Result

   Unique_id  Name               State
       1       Rich,Rich,Rich    PA,PA,PA
       2       Tim,Tim,Tim       DE,DE,DE

Based on a previous question, I can see that I can accomplish this using the following code:

library(dplyr)

df %>%
  group_by(unique_id) %>%
  summarise(name=paste(name,collapse=','))

But how would I apply this to the entire data frame and not just one or two variables?

richiepop2
  • `summarize_each` collapses groups to a single row. `mutate_each` keeps all the rows. – Gregor Thomas Aug 08 '16 at 17:57
  • I want to collapse each group into one row, but keep all values even if they're the same value -- see the newly added tables. Thank you. – richiepop2 Aug 08 '16 at 18:01
  • If you want to keep the dupes, why are you using a custom function that has `unique()` in it? Looks like you want `funs(paste(., collapse = ","))` – Gregor Thomas Aug 08 '16 at 18:14
  • `summarize_each` is only needed for multiple columns; for a single column, `summarize()` is appropriate. Are you really summarizing multiple columns? The function you show, `rm_na_unique`, removes missing and duplicate values. It turns out you want to keep duplicate values, and your example has no missing values. Do you still want to remove missing values? – Gregor Thomas Aug 08 '16 at 18:17
  • I want to keep duplicate values for all variables (except unique_id), not just one. I adjusted the tables again. Thank you. – richiepop2 Aug 08 '16 at 18:24
  • Just like in my comment, using `summarize_each()` with `funs()`. Complete code: `group_by(df, Unique_id) %>% summarize_each(funs(paste(., collapse = ',')))`. – Gregor Thomas Aug 08 '16 at 18:35
  • thank you @Gregor. Works perfectly and will certainly go back and read more on dplyr. Thank you. – richiepop2 Aug 08 '16 at 18:40
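
For anyone reading this later: `summarise_each()` and `funs()` have since been deprecated in dplyr. The sketch below shows roughly how the approach from the comments might look with `across()`, assuming dplyr 1.0.0 or newer. The data frame `df` is reconstructed from the "Starting DF" table above, and the name `result` is only illustrative.

```r
library(dplyr)

# Example data rebuilt from the "Starting DF" table (an assumption about the real data)
df <- data.frame(
  Unique_id = c(1, 1, 1, 2, 2, 2),
  Name      = c("Rich", "Rich", "Rich", "Tim", "Tim", "Tim"),
  State     = c("PA", "PA", "PA", "DE", "DE", "DE"),
  stringsAsFactors = FALSE
)

# Collapse each group to one row, pasting every value (duplicates included)
# of every non-grouping column into a comma-separated string.
result <- df %>%
  group_by(Unique_id) %>%
  summarise(across(everything(), ~ paste(.x, collapse = ",")), .groups = "drop")

print(result)
```

Inside a grouped `summarise()`, `everything()` skips the grouping column, so `Unique_id` is left alone. Note that `paste()` turns missing values into the string "NA"; if they should be dropped instead, something like `~ paste(.x[!is.na(.x)], collapse = ",")` would be one option.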
