python: how to remove words from a string

Question

I want to remove a list of words from a string.

For example, the list is:

["it's","didn't","isn't","don't"]

The string is:

"it's a toy,isn't a tool.i don't know anything."

What should I do to delete it's, didn't, isn't, and don't from the string?

Can you be a little more specific? How are the string and list related? — cs95, Jun 29 '17 at 13:20
Are you saying that you want to delete every occurrence of the words in the list that appear in the string? Your wording is somewhat unclear. — Davy M, Jun 29 '17 at 13:21
@flystar In that case Jack Parkinson's answer should work, he shows you how to split and compare the string to the list. I'll tack on a suggestion in the comments to his answer. — Davy M, Jun 29 '17 at 13:49
@flystar Oh you are right I just read it over again and it's backwards, one moment I'll post a solution — Davy M, Jun 29 '17 at 14:05
@DavyM Thanks, I have replaced the words. But in the result the \n is lost. How do I get the \n back? — flystar, Jun 29 '17 at 14:50
@flystar I don't see the `\n` in the original example, but the second method I put (using the replace method on the string) should keep new lines, spaces, and so forth as the original string holds. If you want a new line at the end, you can finish it with final_string = final_string + "\n" — Davy M, Jun 29 '17 at 15:07

score 9 · Accepted Answer · answered Jun 29 '17 at 14:18

There are a few ways to go about doing this, and I'll address 2. One is to split up the string by words and compare word by word against the list of words that you want to remove. The other is to scan the string for each of those words as a substring. I'll give an example of each with their advantages and disadvantages.

The first way is to split the string into words. This is good because it goes over the whole string, and you can use a list comprehension to pull out just the values you want. However, as written it only splits on whitespace, so it misses any word that is touching punctuation: in this example "toy,isn't" stays a single token, so "isn't" is not removed. Splitting on more than just whitespace avoids that problem so that this approach can work.

your_string = "it's a toy,isn't a tool.i don't know anything."
removal_list = ["it's","didn't","isn't","don't"]

edit_string_as_list = your_string.split()

final_list = [word for word in edit_string_as_list if word not in removal_list]

final_string = ' '.join(final_list)
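
To handle words that touch punctuation, one option is to split with a regular expression instead of str.split(). The following is a minimal sketch, not part of the original answer: the delimiter set (whitespace plus , . ; ! ?) is an assumption chosen for this example string. Because the pattern is wrapped in a capturing group, re.split returns the delimiters as well, so the original spacing, punctuation, and any newlines survive the final join.

```python
import re

your_string = "it's a toy,isn't a tool.i don't know anything."
removal_list = ["it's", "didn't", "isn't", "don't"]

# Split on runs of whitespace and , . ; ! ? (an assumed delimiter set).
# The capturing group makes re.split keep the delimiters in the result.
tokens = re.split(r"([\s,.;!?]+)", your_string)

# Drop the unwanted words; delimiter tokens pass through untouched.
removal_set = set(removal_list)
kept_tokens = [tok for tok in tokens if tok not in removal_set]

# Join with the empty string, since the delimiters are already in the list.
final_string = "".join(kept_tokens)
print(repr(final_string))
```

Each removed word leaves its surrounding spaces behind, so the result has a leading space and a doubled space where "don't" used to be.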

The second option is to remove all instances of those terms from the string as is. This is good because it avoids the punctuation problem, but it does have a drawback: if a term is part of another word, that part is removed too. For example, if the string contains the word "sand" and you remove "and", the "and" is taken out of "sand" and only "s" is left in the string.

your_string = "it's a toy,isn't a tool.i don't know anything."
removal_list = ["it's","didn't","isn't","don't"]

for word in removal_list:
    your_string = your_string.replace(word, "")
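
The "sand" problem described above can be reduced by matching whole words only. The following is a rough sketch using the standard re module; the helper name remove_words is hypothetical and not from the original answer. It wraps the escaped terms in \b word-boundary anchors, so a term only matches when it is not embedded in a longer run of letters, digits, or underscores.

```python
import re

def remove_words(text, words):
    """Remove whole-word occurrences of each entry in words (hypothetical helper)."""
    # re.escape guards against terms that contain regex metacharacters.
    alternatives = "|".join(re.escape(word) for word in words)
    # \b matches only at a boundary between a word character and a non-word character.
    pattern = r"\b(?:" + alternatives + r")\b"
    return re.sub(pattern, "", text)

your_string = "it's a toy,isn't a tool.i don't know anything."
removal_list = ["it's", "didn't", "isn't", "don't"]

cleaned = remove_words(your_string, removal_list)
sand_example = remove_words("sand and sea", ["and"])

print(repr(cleaned))
print(repr(sand_example))
```

One caveat: the apostrophe is not a word character, so a short term such as "it" would still match inside "it's" and leave "'s" behind. Like str.replace, this also keeps the whitespace around each removed word.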

I hope one of these solutions meets your needs.

score 1 · Answer 2 · answered Jun 29 '17 at 13:24

Try this:

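s = "it's a toy,isn't a tool.i don't know anything."
# Named words_to_remove rather than list, which would shadow the built-in type
words_to_remove = ["it's","didn't","isn't","don't"]

split_line = s.split()
# Keep only the words of the string that are not in the removal list
kept_words = [word for word in split_line if word not in words_to_remove]
output = ' '.join(kept_words)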

NB: this doesn't account for instances where words are in different cases or where they are up against punctuation, like yours is here: toy,isn't.
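
As a sketch of one way to cover the case issue mentioned in the NB, each token can be normalized before the comparison. This is an illustration under assumptions, not part of the original answer: the sample string is made up for the example, and the approach only strips punctuation at the ends of whitespace-separated tokens, so a run like toy,isn't with no space would still be kept as one token.

```python
import string

# Made-up sample with mixed case and spaces after punctuation
s = "It's a toy, isn't a tool. I DON'T know anything."
words_to_remove = {"it's", "didn't", "isn't", "don't"}

kept_words = []
for token in s.split():
    # Strip punctuation from both ends and ignore case for the comparison only;
    # the token itself is kept unchanged if it survives.
    normalized = token.strip(string.punctuation).casefold()
    if normalized not in words_to_remove:
        kept_words.append(token)

output = " ".join(kept_words)
print(output)
```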

Jack Parkinson
To account for those punctuation differences, [this question](https://stackoverflow.com/questions/1059559/split-strings-with-multiple-delimiters) addresses how to split on more than just whitespace. – Davy M Jun 29 '17 at 13:50
