
I have the same problem that was discussed in this question: Python extracting sentence containing 2 words

The difference is that I need to extract only the sentences that contain the two words within a search window of a defined size. For example:

sentences = ['There was peace and happiness', 'hello every one', ' How to Find Inner Peace ,love and Happiness ', 'Inner peace is closely related to happiness']

search_words = ['peace', 'happiness']
windows_size = 3  # search only the three words after the first word, 'peace'
# output must be:
output = ['There was peace and happiness', ' How to Find Inner Peace ,love and Happiness ']
tima
  • Please explain a little more clearly. Your output does not match the condition. – Dmitry Jun 20 '20 at 18:04
  • I don't see where the output fails to match the condition. For the 1st sentence in the output, 'happiness' is in the 2nd position of the window, so it satisfies the condition. In the 2nd sentence of the output, 'happiness' is in the 3rd position, so it satisfies the condition as well. For the last sentence in the input, the word 'happiness' is in the 5th position, so we don't take it. The window count starts from position i+1. I hope that's clearer? – tima Jun 20 '20 at 22:03

1 Answer


Here is a crude solution.

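def search(sentences, keyword1, keyword2, window=3):
    res = []
    for sentence in sentences:
        # Lowercase and split on any run of whitespace.
        words = sentence.lower().split()
        if keyword1 in words and keyword2 in words:
            # index() returns the position of the first occurrence only.
            keyword1_idx = words.index(keyword1)
            keyword2_idx = words.index(keyword2)
            # keyword2 must come after keyword1, at most `window` words later.
            if 0 < keyword2_idx - keyword1_idx <= window:
                res.append(sentence)
    return res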

Given a list of sentences and two keywords, keyword1 and keyword2, we iterate over the sentences one by one. We lowercase each sentence and split it into words. After a quick check that both keywords are present in words, we look up the index of the first occurrence of each keyword and compare the two indices to make sure that keyword2 lies within window words of keyword1. Only the sentences that satisfy this condition are appended to res, which is returned at the end.
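
Because the lookup above only considers the first occurrence of each keyword and does not strip punctuation, it can miss some sentences. Below is a rough sketch of a variant that checks every occurrence and ignores punctuation around words. The name search_all_occurrences is made up for illustration, and the sketch assumes that keyword2 must appear strictly after keyword1, as described in the comments on the question.

```python
import string


def search_all_occurrences(sentences, keyword1, keyword2, window=3):
    """Keep sentences where keyword2 occurs at most `window` words after some keyword1."""
    res = []
    for sentence in sentences:
        # Lowercase, split on whitespace, and strip punctuation around each token.
        words = [w.strip(string.punctuation) for w in sentence.lower().split()]
        # Drop tokens that consisted of punctuation only.
        words = [w for w in words if w]
        # Collect every position of each keyword, not just the first one.
        positions1 = [i for i, w in enumerate(words) if w == keyword1]
        positions2 = [i for i, w in enumerate(words) if w == keyword2]
        # Assumption: keyword2 has to come strictly after keyword1.
        if any(0 < j - i <= window for i in positions1 for j in positions2):
            res.append(sentence)
    return res


sentences = [
    'There was peace and happiness',
    'hello every one',
    ' How to Find Inner Peace ,love and Happiness ',
    'Inner peace is closely related to happiness',
]
print(search_all_occurrences(sentences, 'peace', 'happiness', window=3))
```

On the sentences from the question, this should keep the first and third sentences and skip the last one, where 'happiness' is five words after 'peace'.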

Jake Tae