I have a data frame, and I want to get the weight (from a DTM or TDM) of every word in each sentence. From those weights I want the maximum weight together with the word that carries it, and then I want to apply a calculation to each word's weight.
My dataframe is given below:
text
1. miralisitin manzoorpashteen
2. She is best of best.
3. Try again and again.
4. Beware of this woman. She is bad woman.
5. Hold! hold and hold it tight.
I want it to be like:
   text                                      wordweight     maxword  maxcount
1. miralisitin manzoorpashteen               1 1            NA       NA
2. She is best of best.                      1 1 2 1        best     2
3. Try again and again.                      1 2 1          again    2
4. Beware of this woman. She is bad woman.   1 1 1 2 1 1 1  woman    2
5. Hold! hold and hold it tight.             3 1 1 1        hold     3
How can I do this?
I have tried this with the quanteda library but could not get this result, because its dfm() function works on a corpus, not on a data frame. It can also be done with the tm library's DTM or TDM, but not in this form.
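
To make the desired computation concrete, here is a minimal base R sketch that reproduces the expected table above. It is only an illustration under stated assumptions, not a quanteda or tm solution: words are lowercased, punctuation is stripped, counts are listed in order of first appearance, and a maximum is reported only when some word occurs more than once (otherwise NA, as in row 1). The helper name `count_words` is hypothetical.

```r
# Example data frame from the question (single column named `text`)
df <- data.frame(
  text = c("miralisitin manzoorpashteen",
           "She is best of best.",
           "Try again and again.",
           "Beware of this woman. She is bad woman.",
           "Hold! hold and hold it tight."),
  stringsAsFactors = FALSE
)

# Hypothetical helper: per-sentence word counts.
# Assumption: a "word" is a lowercased, punctuation-free, whitespace-separated token.
count_words <- function(sentence) {
  cleaned <- gsub("[[:punct:]]", "", tolower(sentence))
  words <- strsplit(trimws(cleaned), "[[:space:]]+")[[1]]
  # Fixing the factor levels keeps the counts in order of first appearance
  table(factor(words, levels = unique(words)))
}

counts <- lapply(df$text, count_words)

# Counts of each distinct word, pasted into one string per row
df$wordweight <- vapply(
  counts,
  function(x) paste(as.integer(x), collapse = " "),
  character(1)
)

# Most frequent word and its count; NA when no word repeats.
# Ties go to the word that appears first (behaviour of which.max).
df$maxword <- vapply(
  counts,
  function(x) if (max(x) > 1) names(x)[which.max(x)] else NA_character_,
  character(1)
)
df$maxcount <- vapply(
  counts,
  function(x) if (max(x) > 1) as.integer(max(x)) else NA_integer_,
  integer(1)
)

print(df)
```

The per-row count tables stay available in `counts`, so any further calculation on the individual word weights can be applied there with `lapply()`. Empty strings are not handled in this sketch.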