I have a text variable and a grouping variable. I'd like to collapse (combine) the text variable into one string per run of the grouping factor: as long as consecutive rows of the group column say m, their text should be pasted together, and likewise for each run that follows. I have provided a sample data set, before and after. I am writing this for a package and have so far avoided relying on any other package except wordcloud, and I would like to keep it that way.
I suspect rle may be useful, together with cumsum, but I haven't been able to figure this one out.
Thank you in advance.
What the data looks like
  text                               group
1 Computer is fun. Not too fun.      m
2 No its not, its dumb.              m
3 How can we be certain?             f
4 There is no way.                   m
5 I distrust you.                    m
6 What are you talking about?        f
7 Shall we move on? Good then.       f
8 Im hungry. Lets eat. You already?  m
What I'd like the data to look like
  text                                                      group
1 Computer is fun. Not too fun. No its not, its dumb.       m
2 How can we be certain?                                    f
3 There is no way. I distrust you.                          m
4 What are you talking about? Shall we move on? Good then.  f
5 Im hungry. Lets eat. You already?                         m
The Data
dat <- structure(list(text = c("Computer is fun. Not too fun.", "No its not, its dumb.",
"How can we be certain?", "There is no way.", "I distrust you.",
"What are you talking about?", "Shall we move on? Good then.",
"Im hungry. Lets eat. You already?"), group = structure(c(2L,
2L, 1L, 2L, 2L, 1L, 1L, 2L), .Label = c("f", "m"), class = "factor")), .Names = c("text",
"group"), row.names = c(NA, 8L), class = "data.frame")
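To make the rle and cumsum idea concrete, here is a rough sketch of what the two functions return for this grouping. It uses base R only; the vector g is a hypothetical stand-in for as.character(dat$group), and run_id is a name made up for illustration.

```r
# Hypothetical stand-in for as.character(dat$group), so this runs on its own.
g <- c("m", "m", "f", "m", "m", "f", "f", "m")

runs <- rle(g)
runs$lengths          # number of rows in each consecutive run
runs$values           # group value of each run
cumsum(runs$lengths)  # row index at which each run ends

# One possible run id without rle: count the positions where the value changes.
run_id <- cumsum(c(TRUE, g[-1] != g[-length(g)]))
run_id
```

Either route gives one identifier per consecutive run, which is the piece needed before the text can be pasted together.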
EDIT: I found I can add a unique identifier column for each run of the group variable with:
x <- rle(as.character(dat$group))$lengths   # length of each consecutive run
dat$new <- as.factor(rep(seq_along(x), x))  # run id, repeated once per row in the run
Yielding:
  text                               group  new
1 Computer is fun. Not too fun.      m      1
2 No its not, its dumb.              m      1
3 How can we be certain?             f      2
4 There is no way.                   m      3
5 I distrust you.                    m      3
6 What are you talking about?        f      4
7 Shall we move on? Good then.       f      4
8 Im hungry. Lets eat. You already?  m      5
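From here, one possible way to finish the collapse with base R only might look like the sketch below. It is an untested-in-the-package illustration rather than a definitive solution: the data frame is rebuilt by hand so the block runs on its own, and the names runs, run_id and collapsed are made up for the example.

```r
# Sample data re-created so the sketch runs on its own (base R only).
dat <- data.frame(
  text = c("Computer is fun. Not too fun.", "No its not, its dumb.",
           "How can we be certain?", "There is no way.", "I distrust you.",
           "What are you talking about?", "Shall we move on? Good then.",
           "Im hungry. Lets eat. You already?"),
  group = factor(c("m", "m", "f", "m", "m", "f", "f", "m")),
  stringsAsFactors = FALSE
)

# Run id: one integer per consecutive run of the same group value.
runs <- rle(as.character(dat$group))
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Paste the text within each run and pair it with that run's group value.
collapsed <- data.frame(
  text = vapply(split(dat$text, run_id), paste, character(1),
                collapse = " ", USE.NAMES = FALSE),
  group = factor(runs$values),
  stringsAsFactors = FALSE
)
collapsed
```

The group column comes straight from the values component of rle, so it stays in step with the pasted text without a second pass over the data.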