I am doing some basic NLP work in R. I have two data sets and want to replace each word in the first with its cluster value from the second.
The first data set holds sentences, and the second holds the cluster value for each word (assume that every word in the first data set has a cluster value):
original_text_df <- read.table(text="Text
'this is some text'
'this is more text'", header=TRUE, sep="", stringsAsFactors=FALSE)
cluster_df <- read.table(text="Word Cluster
this 2
is 2
some 3
text 4
more 3", header=TRUE, sep="", stringsAsFactors=FALSE)
This is the desired transformed output:
Text
"2 2 3 4"
"2 2 3 4"
I am looking for an efficient solution, as I have many long sentences. Thanks!
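
For reference, here is a minimal sketch of the transformation described above. It is one possible approach, not a benchmarked solution. It assumes that words are separated by single spaces, that every word appears in cluster_df, and that the text columns are character rather than factor. The name `lookup` is introduced here only for illustration.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps text as character.
original_text_df <- read.table(text = "Text
'this is some text'
'this is more text'", header = TRUE, sep = "", stringsAsFactors = FALSE)

cluster_df <- read.table(text = "Word Cluster
this 2
is 2
some 3
text 4
more 3", header = TRUE, sep = "", stringsAsFactors = FALSE)

# Named lookup vector (hypothetical helper name): names are words, values are cluster ids.
lookup <- setNames(cluster_df$Cluster, cluster_df$Word)

# Split each sentence on single spaces (assumption: no other separators).
words <- strsplit(original_text_df$Text, " ", fixed = TRUE)

# Map each word to its cluster by name and join the ids back into one string.
original_text_df$Text <- vapply(
  words,
  function(w) paste(lookup[w], collapse = " "),
  character(1)
)

print(original_text_df)
```

Indexing a named vector by a character vector is vectorized, so each sentence costs one lookup call rather than one per word. A word missing from cluster_df would come out as "NA" in the result.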