I am doing some sentiment analysis and want to put the tokenized words back together into the text they came from. That is, all the words that share a unique ID should be combined into a single row, which is the opposite of what the unnest_tokens function does. I have tried the following:
dsWords <- dsWords %>%
  group_by(IDReview) %>%
  summarize(text = str_c(word, collapse = " ")) %>%
  ungroup()
However, this combines all the words into a single row instead of giving one row per unique ID. Can anyone help me out here? Below is a screenshot of what my data frame looks like and a subset of my data.
structure(list(IDReview = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
word = c("love", "love", "author", "side", "end", "show",
"one", "way", "think", "everyon", "also", "idea", "mani",
"amaz", "look", "mani", "idea", "think", "learn", "someth",
"dont", "know", "look", "fact", "see", "right", "dont", "write",
"review", "will", "hero", "will", "hes", "person", "tri",
"short", "certain", "never", "find", "like")), row.names = c("1",
"1.1", "1.2", "1.4", "1.6", "1.13", "1.14", "1.15", "1.16", "1.17",
"1.18", "1.19", "1.20", "1.24", "1.25", "1.27", "1.28", "1.30",
"1.33", "1.34", "1.35", "1.36", "1.37", "1.38", "1.39", "1.41",
"1.42", "1.44", "1.45", "2", "2.3", "2.5", "2.10", "2.12", "2.18",
"2.23", "2.26", "2.27", "2.30", "2.34"), class = "data.frame")