I want to remove the rows in which the same word occurs two or more times directly after each other, i.e. consecutively within the sequence. This is in preparation for a sequential pattern mining analysis.
I already tried the distinct() and duplicated() functions, but these only remove rows that are exact duplicates of another whole row:
r_seq_5 <- r_seq_5[!duplicated(r_seq_5),] # remove duplicates
# Su Score result ROI next_roi third_roi four_roi five_roi
# 1 1 90 high Elsewhere Elsewhere Teacher Teacher Teacher
# 2 1 90 high Elsewhere Teacher Teacher Teacher Teacher
# 3 1 90 high Teacher Pen Teacher Elsewhere Smartboard
This is the table. It does not matter if Teacher occurs two or three times in a sequence, as long as the occurrences are not directly after each other.
The desired result is:
# 1 1 90 high Teacher Pen Teacher Elsewhere Smartboard
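To make the requirement concrete, here is a minimal base R sketch of one possible way to do this. It assumes the five sequence columns are named as in the table header above (ROI, next_roi, third_roi, four_roi, five_roi) and that they hold character values; the data frame is rebuilt from the printed table, and the helper names roi_cols, has_repeat and r_seq_clean are made up for illustration.

```r
# Example data rebuilt from the table above (assumed to be character columns)
r_seq_5 <- data.frame(
  Su        = c(1, 1, 1),
  Score     = c(90, 90, 90),
  result    = c("high", "high", "high"),
  ROI       = c("Elsewhere", "Elsewhere", "Teacher"),
  next_roi  = c("Elsewhere", "Teacher", "Pen"),
  third_roi = c("Teacher", "Teacher", "Teacher"),
  four_roi  = c("Teacher", "Teacher", "Elsewhere"),
  five_roi  = c("Teacher", "Teacher", "Smartboard"),
  stringsAsFactors = FALSE
)

# Columns that make up the sequence (hypothetical helper vector)
roi_cols <- c("ROI", "next_roi", "third_roi", "four_roi", "five_roi")

# For each row, compare every value with the one directly before it;
# the row is flagged if any adjacent pair is identical
has_repeat <- apply(
  r_seq_5[roi_cols], 1,
  function(x) any(x[-1] == x[-length(x)], na.rm = TRUE)
)

# Keep only the rows without consecutive repeats
r_seq_clean <- r_seq_5[!has_repeat, ]
r_seq_clean
```

With the three example rows, only the third one (Teacher Pen Teacher Elsewhere Smartboard) should remain, because Teacher occurs twice there but never in adjacent positions. Missing values are ignored in the comparison because of na.rm = TRUE; whether that is the right behaviour depends on the data.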