I have four words: wordA, wordB, wordX and wordY. I have a data set with a single column (message), and that column is stored as a factor. For each row, I want to count the total number of occurrences of wordA and wordB, subtract the total number of occurrences of wordX and wordY, and put the result in a new column in that row.
For example, if the text in the message column is "wordD wordA wordX wordA wordC wordA wordB wordY", then the value should be wordA - wordX + wordA + wordA + wordB - wordY = 1 - 1 + 1 + 1 + 1 - 1 = +2.
I wrote the code below, but it doesn't count repeated words: each word is counted at most once per row. I would appreciate any help.
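for (i in 1:nrow(dataset)) {
  counter <- 0
  # grep() reports whether the element matches at all (length 0 or 1),
  # so a word that appears several times in a row is still counted once.
  if (length(grep("wordA", dataset[i, 1])) == 1) {
    counter <- counter + 1
  }
  if (length(grep("wordB", dataset[i, 1])) == 1) {
    counter <- counter + 1
  }
  if (length(grep("wordX", dataset[i, 1])) == 1) {
    counter <- counter - 1
  }
  if (length(grep("wordY", dataset[i, 1])) == 1) {
    counter <- counter - 1
  }
  # Store the result in a second column.
  dataset[i, 2] <- counter
}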
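
For reference, here is a minimal sketch of the result I am after, written in base R. It assumes the words in each message are separated by whitespace and should match as whole tokens (so "wordAB" would not count as wordA). The sample rows, the `score` column name, and the `plus_words` / `minus_words` vectors are made up for illustration and are not part of my real data.

```r
# Hypothetical sample data; only the `message` factor column mirrors the real data set.
dataset <- data.frame(
  message = factor(c(
    "wordD wordA wordX wordA wordC wordA wordB wordY",
    "wordX wordY wordC",
    "wordC wordD"
  ))
)

plus_words  <- c("wordA", "wordB")  # every occurrence adds 1
minus_words <- c("wordX", "wordY")  # every occurrence subtracts 1

# Convert the factor to character, then split each message into tokens on whitespace.
tokens <- strsplit(as.character(dataset$message), "\\s+")

# For each row, count every matching token (duplicates included) and take the difference.
dataset$score <- vapply(
  tokens,
  function(tk) sum(tk %in% plus_words) - sum(tk %in% minus_words),
  integer(1)
)

print(dataset$score)
```

With these sample rows, the first message gives 4 - 2 = 2, matching the worked example above. If the words can be attached to punctuation or embedded in longer strings, exact token matching would miss them, and a pattern-based count would be needed instead.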