> I was thinking a good approach would be to convert the words linked list to a string.
Any time you have a list of X and a list of Y, and you want to check whether any of the elements in X are in Y, what you need is probably a hash set (not a list).
Hash sets offer fast lookups of fixed values. Your algorithm should be:
- load the list of items you are searching for into the set
- enumerate the list you are searching in, repeatedly asking whether the current item is in the set
    var hs = listOfWords.ToHashSet();
    foreach (var sentence in listOfSentences)
    {
        foreach (var word in sentence.Split())
        {
            if (hs.Contains(word))
            {
                ...
            }
        }
    }
Or, in a LINQ-flavored approach:
    var hs = listOfWords.ToHashSet();
    var result = listOfSentences.Where(sentence =>
        sentence.Split().Any(word =>
            hs.Contains(word)
        )
    );
Caution: C# hashing of strings is, by default, case sensitive, and every character contributes to string equality. For a word list of "hello", "world", "foo", "bar" and a list of sentences of "Hello world!", "Foo bar.", these sentences do NOT contain any of the words in the word list: "Hello" is not equal to "hello", and "world!" is not equal to "world". Process your sentences carefully so that you are comparing apples with apples, i.e. strip punctuation and make the case equal, for example:
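One possible way to do that normalization is sketched below. It is not the only option: `WordMatcher`, `Words` and `Matching` are hypothetical names made up for this illustration, and the choice to keep only letters and digits is an assumption that may not suit your data (it turns "don't" into "dont", for instance). Case is handled by giving the set a case-insensitive comparer rather than lowercasing every word.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class WordMatcher
{
    // Splits a sentence on whitespace and strips punctuation by keeping
    // only the letters and digits of each token (an assumption: adjust
    // this filter if apostrophes or hyphens matter in your data).
    public static IEnumerable<string> Words(string sentence) =>
        sentence
            .Split()
            .Select(token => new string(token.Where(c => char.IsLetterOrDigit(c)).ToArray()))
            .Where(word => word.Length > 0);

    // Returns the sentences containing at least one of the words,
    // ignoring case and punctuation.
    public static List<string> Matching(
        IEnumerable<string> listOfWords,
        IEnumerable<string> listOfSentences)
    {
        // The comparer makes "Hello" and "hello" hash and compare as equal.
        var hs = new HashSet<string>(listOfWords, StringComparer.OrdinalIgnoreCase);

        return listOfSentences
            .Where(sentence => Words(sentence).Any(word => hs.Contains(word)))
            .ToList();
    }
}
```

With the word list "hello", "world", "foo", "bar", calling `WordMatcher.Matching` on the sentences "Hello world!" and "Foo bar." should now return both of them, because "Hello" matches "hello" and "world!" is reduced to "world" before the lookup.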