My data frame (df) looks like this:

    Comments  
-----------------
1 | comment1  
2 | comment2  
3 | comment3  
4 | comment4

...

I have created two lists as follows:

list1<-c("money","finance","aid")  
list2<-c("major","degree")    

I want to search through the rows of a data frame that holds comments from different people. When any of the words in list1 is found in a particular row, counter1 should increment, and when any of the words in list2 is found, counter2 should increment.

I want to get results as:

counter1=10 ; counter2=25

Note: I don't want the counter to increment for every occurrence of a word. For example, if a comment contains both "money" and "finance", counter1 should increment only once. But if it contains "money" and "major", both counter1 and counter2 should increment.

BENY

1 Answer

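You can collapse each list into a single regular expression by joining its words with "|", so that grepl returns TRUE for a row if any of the words is found in it. Summing the result then counts each matching row once. Example: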
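As a quick illustration of what the collapsed pattern looks like, here is a small sketch. The `pattern1` name and the two test strings are made up for this illustration and are not part of the question.

```r
list1 <- c("money", "finance", "aid")

# Join the words with "|" so the regular expression matches any one of them
pattern1 <- paste(list1, collapse = "|")
pattern1  # "money|finance|aid"

# grepl() returns one TRUE/FALSE per string, no matter how many words matched
hits <- grepl(pattern1, c("money and finance", "nothing relevant"))
hits  # TRUE FALSE

# Summing the logical vector counts the strings with at least one match
sum(hits)  # 1
```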


Sample data

comments = data.frame(text=c("only list 1 since money","only list 2 since major","both lists money major","money finance list 1 once"))

                       text
1   only list 1 since money
2   only list 2 since major
3    both lists money major
4 money finance list 1 once

Code

list1<-c("money","finance","aid")  
list2<-c("major","degree")    

counter1=sum(grepl(paste(list1,collapse="|"),comments$text))
counter2=sum(grepl(paste(list2,collapse="|"),comments$text))

Result

counter1: 3
counter2: 2
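One caveat that the question does not raise: the pattern above matches substrings, so "aid" would also match inside "paid". If whole-word, case-insensitive matching is wanted, a possible variant wraps the alternatives in `\\b` word boundaries. This is only a sketch; the `count_rows_matching` helper and the extra fifth comment are hypothetical and added for illustration.

```r
# Hypothetical helper: count rows containing at least one whole word from `words`
count_rows_matching <- function(text, words) {
  # \\b marks a word boundary; the group applies both boundaries to every alternative
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  sum(grepl(pattern, as.character(text), ignore.case = TRUE, perl = TRUE))
}

# Sample data from above plus one made-up row that contains "paid"
comments <- data.frame(text = c("only list 1 since money",
                                "only list 2 since major",
                                "both lists money major",
                                "money finance list 1 once",
                                "I paid already"))

counter1 <- count_rows_matching(comments$text, c("money", "finance", "aid"))
counter2 <- count_rows_matching(comments$text, c("major", "degree"))

counter1  # 3, the "paid" row is not counted
counter2  # 2
```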

Hope this helps!

Florian
  • It did not quite work for my data frame. Looking into the difference, I found that the column is a factor with only 1 level, even though it has 2351 rows. Can you help me solve that? – Darpit Dave Jul 31 '17 at 18:14
  • If your data frame is `df` and your column is called `txt`, do `df$txt <- as.character(df$txt)`. If that does not work, it might be worth opening a new question about that issue with a reproducible example, as mentioned [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Please provide the sample data! – Florian Jul 31 '17 at 18:16
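
A minimal sketch of the conversion suggested in the last comment, assuming a data frame `df` with a factor column `txt` (both names come from the comment; the sample values are made up, and this does not reproduce the single-level factor described by the asker):

```r
# Made-up sample data; stringsAsFactors = TRUE mimics a factor column
df <- data.frame(txt = c("need money for my degree", "no match here"),
                 stringsAsFactors = TRUE)
class_before <- class(df$txt)  # "factor"

# Convert the factor to plain character strings before matching
df$txt <- as.character(df$txt)
class_after <- class(df$txt)   # "character"

list1 <- c("money", "finance", "aid")
counter1 <- sum(grepl(paste(list1, collapse = "|"), df$txt))
counter1  # 1
```

If the column still shows a single level after conversion, checking `unique(df$txt)` may help reveal how the data was read in.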