I have a list of 3, named listWords:

 $ : chr [1:6] "Maintenance" "repair" "installation" "activities" ...
 $ : chr [1:19] "Manufacture" "specific" "equipment" "energy" ...
 $ : chr [1:14] "Manufacture" "discharge" "lamps" "pressure" ...

I have another list, named wordsCP (for example):

$ : chr [1:3] "Cauliflowers" "and" "broccoli"
$ : chr "Lettuce"

I would like to find the items in wordsCP that contain at least 2 or 3 words from an element of listWords. How can I do that?

As a result, I should get row numbers for both lists, so that I can see that the words in row 1 of listWords are found in rows x, y or z of wordsCP.

macropod
IRT
What is your expected result? I.e., do you want to keep the list structure of wordsCP and check each list element separately, or check across all list elements together? Same question for the input: do you mean 2 out of 3 words per list element? – deschen Aug 08 '22 at 10:13
Also, please see: https://stackoverflow.com/help/minimal-reproducible-example – deschen Aug 08 '22 at 10:13

1 Answer

The code below returns a list of all elements of wordsCP that share 2 or more words with any single element of listWords:

listWords <- list(c("please", "make", "a", "reprex"),
                  c("check", "https://stackoverflow.com/a/5965451/5224236"))

wordsCP <- list(c("a", "reprex"),
                c("will", "get", "you", "better", "answers"),
                c("check", "https://stackoverflow.com/a/5965451/5224236"))

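# match_matrix[i, j] is TRUE when at least 2 words of wordsCP[[j]] occur in listWords[[i]]
match_matrix <- as.data.frame(sapply(wordsCP, function(x) sapply(listWords, function(y) sum(x %in% y) >= 2)))

# TRUE for each element of wordsCP that matches at least one element of listWords
matches <- sapply(match_matrix, any)

wordsCP[matches]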

[[1]]
[1] "a"      "reprex"

[[2]]
[1] "check"                                      
[2] "https://stackoverflow.com/a/5965451/5224236"
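
The question also asks for the row numbers in both lists. The following is a minimal sketch of one way to get them, using the same example data. The names `counts` and `pairs` and the column labels are illustrative and not part of the original answer. It uses `intersect()`, so a repeated word is counted once, which differs slightly from `sum(x %in% y)`. Matching is case-sensitive; applying `tolower()` to both lists first may be needed for real data.

```r
listWords <- list(c("please", "make", "a", "reprex"),
                  c("check", "https://stackoverflow.com/a/5965451/5224236"))

wordsCP <- list(c("a", "reprex"),
                c("will", "get", "you", "better", "answers"),
                c("check", "https://stackoverflow.com/a/5965451/5224236"))

# counts[i, j] = number of distinct words shared by listWords[[i]] and wordsCP[[j]]
shared <- function(i, j) length(intersect(listWords[[i]], wordsCP[[j]]))
counts <- outer(seq_along(listWords), seq_along(wordsCP), Vectorize(shared))

# One row per (listWords element, wordsCP element) pair sharing at least 2 words
pairs <- which(counts >= 2, arr.ind = TRUE)
colnames(pairs) <- c("listWords_row", "wordsCP_row")
pairs
```

Each row of `pairs` gives an index into listWords and the index of a wordsCP element that shares at least 2 words with it. Changing the threshold from 2 to 3 only requires editing the `counts >= 2` comparison.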
gaut