
I would like to create a new dataframe based on an existing one. As the title suggests, I would like to paste together all string values in one column for rows that share the same value in another column.

That description may not be very clear, so I've created an example.

Existing Dataframe

If I have something like this:

DF <- data.frame(
    ID = c(1,2,2,3,3,3,4,4,4,4),
    value = c("I","ate","cereals","for","breakfast","it","was","delicious","!!!",":)"))  

New Dataframe

I would like to create something like this:

DF2 <- data.frame(
    ID = c(1,2,3,4),
    value = c(paste("I"), paste("ate","cereals"), paste("for","breakfast","it"), paste("was","delicious","!!!",":)")))

All strings in column value are consolidated with paste when they share the same value in column ID. I'm having trouble building a function that can do this. Could you please help me?

I am comfortable with either dplyr or data.table.

Maurits Evers
wyatt

2 Answers


In dplyr, you can use group_by() with summarise():

library(dplyr)

# Collapse all strings within each ID into one space-separated string
DF %>%
    group_by(ID) %>%
    summarise(value = paste(value, collapse = " "))
## A tibble: 4 x 2
#     ID value
#  <dbl> <chr>
#1    1. I
#2    2. ate cereals
#3    3. for breakfast it
#4    4. was delicious !!! :)
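
Since the question also mentions data.table, the same grouping could be sketched there as well. This is only an illustrative alternative, not part of the answer above; the names DT and DT2 are assumptions introduced here, and stringsAsFactors = FALSE is set so that value stays a character column on older R versions.

```r
library(data.table)

# Example data from the question
DF <- data.frame(
    ID = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
    value = c("I", "ate", "cereals", "for", "breakfast", "it",
              "was", "delicious", "!!!", ":)"),
    stringsAsFactors = FALSE)

# Convert to a data.table (DT is a hypothetical name)
DT <- as.data.table(DF)

# For each ID, collapse the strings in value into one space-separated string
DT2 <- DT[, .(value = paste(value, collapse = " ")), by = ID]
```

DT2 should then hold one row per ID, in the order the IDs first appear in DT.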
Maurits Evers

You can just group_by(ID) and summarise with a concatenation function. Here I use str_c with the collapse argument.

library(tidyverse)
DF <- data.frame(
  ID = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
  value = c("I", "ate", "cereals", "for", "breakfast", "it", "was", "delicious", "!!!", ":)")
)

DF %>%
  group_by(ID) %>%
  summarise(value = str_c(value, collapse = " "))
#> # A tibble: 4 x 2
#>      ID value               
#>   <dbl> <chr>               
#> 1     1 I                   
#> 2     2 ate cereals         
#> 3     3 for breakfast it    
#> 4     4 was delicious !!! :)

Created on 2018-08-26 by the reprex package (v0.2.0).
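
If you would rather not load any packages, base R's aggregate() can perform the same grouping. The following is a minimal sketch under that assumption, not part of the reprex above; the name DF2_base is made up for illustration.

```r
# Example data from the question
DF <- data.frame(
  ID = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
  value = c("I", "ate", "cereals", "for", "breakfast", "it",
            "was", "delicious", "!!!", ":)"),
  stringsAsFactors = FALSE
)

# Group value by ID and collapse each group into a single string
DF2_base <- aggregate(
  value ~ ID,
  data = DF,
  FUN = function(x) paste(x, collapse = " ")
)
```

The result is a plain data.frame with columns ID and value, one row per distinct ID.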

Calum You