I am trying to concatenate rows of text, grouped by character name, in a data frame that looks something like this:
library(data.table)
library(stringr)  # provides str_c(), used below
df <- data.frame(name = c("Kyle", "Cartman", "Randy", "Kyle", "Cartman", "Randy", "Kyle", "Cartman", "Randy"),
                 lines = c("Hello", "Hello", "Hello", "my name is", "my name is", "my name is", "Kyle", "Cartman", "Randy"))
df <- data.table(df)
df
## name lines
## 1 Kyle Hello
## 2 Cartman Hello
## 3 Randy Hello
## 4 Kyle my name is
## 5 Cartman my name is
## 6 Randy my name is
## 7 Kyle Kyle
## 8 Cartman Cartman
## 9 Randy Randy
My desired data frame has one row per name and should look like this:
df
## name lines
## 1 Kyle Hello my name is Kyle
## 2 Cartman Hello my name is Cartman
## 3 Randy Hello my name is Randy
After some research, I found a solution in the question "Concatenate rows in a dataframe", but it keeps every original row and I can't figure out how to delete the repeated rows:
# := adds the new column by reference, so all nine original rows are kept
df[, newlines := str_c(lines, collapse = " "), by = name]
df
## name lines
## 1 Kyle Hello my name is Kyle
## 2 Cartman Hello my name is Cartman
## 3 Randy Hello my name is Randy
## 4 Kyle Hello my name is Kyle
## 5 Cartman Hello my name is Cartman
## 6 Randy Hello my name is Randy
## 7 Kyle Hello my name is Kyle
## 8 Cartman Hello my name is Cartman
## 9 Randy Hello my name is Randy
Is there another way to concatenate the rows so that the result contains only one row per name, without duplicates?
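For reference, here is a minimal sketch of two directions that might work, assuming data.table is available. The object names `dt`, `res`, and `res2` are made up for illustration, and base `paste()` stands in for `str_c()` so that the snippet does not depend on stringr. The idea is that aggregating with `.()` returns one row per group, whereas `:=` only adds a column to the existing rows.

```r
library(data.table)

# Example data rebuilt to match the printed table (names in title case)
dt <- data.table(
  name  = rep(c("Kyle", "Cartman", "Randy"), times = 3),
  lines = c(rep("Hello", 3), rep("my name is", 3), "Kyle", "Cartman", "Randy")
)

# Option 1: aggregate with .() instead of :=, which yields one row per name
res <- dt[, .(lines = paste(lines, collapse = " ")), by = name]

# Option 2: keep the := approach, then drop the duplicated rows afterwards
dt[, newlines := paste(lines, collapse = " "), by = name]
res2 <- unique(dt[, .(name, newlines)])
```

Both `res` and `res2` should end up with three rows, one per name. Option 1 is probably the more direct choice when the original rows are no longer needed.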