0

I have a sentence and a list that look like this:

sentence = 'hi how is the high school'
check_words = [
    'me',
    'you',
    'hi',
]

Now I want to check the sentence and remove any of the check_words items that appear in it.
The output should look like this:

sentence = 'how is the high school'

But when I use this code, it also removes the extra hi inside high. This is the code:

temp = sentence.split()
for k in temp:
    if k in check_words:
        sentence = sentence.replace(k, '')
  • Do not use `replace` to remove words repeatedly in a sentence; it is both inefficient and error-prone. Instead, parse the sentence word by word and choose whether to keep each one (cf. [my answer](https://stackoverflow.com/a/69871770/16343464)) – mozway Nov 07 '21 at 11:23

4 Answers

0

Do not replace the words. It is inefficient, because each call to replace has to scan the whole string at every step of the loop, and it is error-prone, because replace matches substrings rather than isolated words (hence your hi -> high error). Instead, check each word and decide whether to include it in the final output.
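To make the substring problem concrete, here is a minimal sketch using the sentence from the question. The variable names `replaced` and `kept` are only illustrative.

```python
sentence = 'hi how is the high school'

# str.replace works on substrings, so the 'hi' inside 'high' is removed as well
replaced = sentence.replace('hi', '')
print(repr(replaced))   # ' how is the gh school'

# Comparing whole words leaves 'high' untouched
kept = ' '.join(w for w in sentence.split() if w != 'hi')
print(repr(kept))       # 'how is the high school'
```

Note that the replace version also leaves a leading space behind, because only the letters of the word are removed.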

You can use a generator expression (converting check_words to a set makes each membership test fast):

check_words = set(check_words)
new_sentence = ' '.join(word for word in sentence.split()
                        if word not in check_words)

Using your loop:

out = []
for k in sentence.split():
    if k not in check_words:
        out.append(k)
print(' '.join(out))

output: how is the high school

mozway
0
sentence = 'hi how is the high school'
check_words = [
    'me',
    'you',
    'hi',
]
sentence_list = sentence.split()
# Loop over a copy: removing from a list while iterating over it skips items
for i in sentence_list[:]:
    if i in check_words:
        sentence_list.remove(i)

sentence = " ".join(sentence_list)

This works, but there is probably a way to do this in one line
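One caveat with loops of this kind: removing items from a list while iterating over that same list can skip the element right after each removal, so consecutive matches may survive. The sketch below compares the two approaches; the function names `remove_in_place` and `remove_from_copy` are hypothetical helpers, not part of the answer.

```python
def remove_in_place(words, banned):
    """Remove banned words while looping over the same list (can skip items)."""
    for w in words:
        if w in banned:
            words.remove(w)
    return words


def remove_from_copy(words, banned):
    """Loop over a shallow copy so removals do not disturb the iteration."""
    for w in words[:]:
        if w in banned:
            words.remove(w)
    return words


banned = {'me', 'you', 'hi'}
text = 'hi hi how is the high school'

in_place = remove_in_place(text.split(), banned)
from_copy = remove_from_copy(text.split(), banned)

print(in_place)    # the second 'hi' survives
print(from_copy)   # both 'hi' words are removed
```

With a single 'hi' at the start of the sentence, as in the question, both versions give the same result; the difference only shows up when two matches sit next to each other.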

mozway
olavv
0

If replace() does not give you the result you want, you can build a new string instead. Split the sentence with split(), compare each word with the list, and append the word to the new string only if it is not in the list. Your code:

new_sentence = ""
check_words = ["hi", "me", "you"]
sentence = "hi how is the high school"
for i in sentence.split():
    if i not in check_words:
        new_sentence += i + " "
new_sentence = new_sentence.rstrip()  # drop the trailing space added by the loop
print(new_sentence)
0

This happens because replace() replaces every occurrence of the old substring with the new string, including occurrences inside longer words. It also accepts an optional count argument that limits how many occurrences are replaced.

#sentence = sentence.replace(k, '')
sentence = sentence.replace(k, '', 1)   # removes only the first match; enough for this example, but it still matches substrings
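As a quick check of how count behaves, the sketch below applies it to the sentence from the question and to a second, made-up sentence (`other`) where the first match happens to fall inside a longer word.

```python
sentence = 'hi how is the high school'
first = sentence.replace('hi', '', 1)
print(repr(first))    # ' how is the high school' (note the leading space)

# Hypothetical second sentence: the first 'hi' is inside 'high'
other = 'the high school says hi'
second = other.replace('hi', '', 1)
print(repr(second))   # 'the gh school says hi'
```

So count=1 gives the expected words here only because the standalone hi comes before high in the sentence.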

However, it would be better to do this with a list comprehension:

sentence = " ".join([word for word in temp if word not in check_words])

The above line is equivalent to:

sentence = []
for word in temp:
    if word not in check_words:
        sentence.append(word)
sentence = " ".join(sentence)
desertnaut
Muhd Mairaj