
I have a list of banned words, banned=["things", "show", "nature", "strange image"], and I need to read a forum message text file and replace every banned word in it with asterisks of the same length.

This is the forum file

Forum ONE 

Are things real?
What is nature?
Show me your image.
You have shown me a strange image, and they are strange prisoners.

The expected output is

Forum ONE 

Are ****** real?
What is ******?
**** me your image.
You have shown me a *************, and they are strange prisoners.

My actual output is

Forum ONE 

Are ****** real?
What is ******?
Show me your image.
You have ****n me a *************, and they are strange prisoners.

Matching is case-insensitive, so Show with a capital S counts as banned, but shown should not be treated as a banned word.

Below is my code

#banned_list
banned=["things","nature","strange image","show"]

#read message
with open("forum1","r")as f:
 message=f.readlines()

#append modified message in a new list
new_forum=[]
i=0
while i<len(message):
  j=0
  while j<len(banned):
    if message[i].__contains__(banned[j]):
        message[i]=message[i].replace(banned[j],len(banned[j])*"*")
        j+=1
    else:
        j+=1
  new_forum.append(message[i])
  i+=1

#write to a new file
with open("new_forum1","w")as n:
  i=0
  while i<len(new_forum):
    n.write(new_forum[i])
    i+=1

Since this is school homework, I am not allowed to use for and in. How should I modify my code?

desertnaut
Lu Yubing
  • you have a typo in your banned words: `straneg image` – ScottC Oct 22 '22 at 01:59
  • Special methods like __contains__ are not meant to be called directly. See [What does __contains__ do, what can call __contains__ function](https://stackoverflow.com/questions/1964934/what-does-contains-do-what-can-call-contains-function) – DarrylG Oct 22 '22 at 02:00
  • i would use `regex` for this type of problem – ScottC Oct 22 '22 at 02:02

1 Answer


Have you been told that you're not to use re?

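import re

# assumes `banned` is the list from the question and `message` is a single
# string (e.g. from f.read()), not the list returned by f.readlines()
banned_to_check = banned[:]  # copy, so popping does not empty `banned`
while banned_to_check:
    word = banned_to_check.pop()
    # (?i) ignores case; \b matches only at word boundaries, so "shown" is kept
    regex = r'(?i)\b' + re.escape(word) + r'\b'
    message = re.sub(regex, '*' * len(word), message)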
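As a rough check, the loop above can be wrapped in a function and run on the sample forum text from the question. The names `censor_with_regex` and `sample` are made up for this sketch, and the text is held in one string rather than read from a file.

```python
import re

def censor_with_regex(message, banned):
    """Replace each banned word in `message` (a single string) with asterisks."""
    banned_to_check = banned[:]  # copy so the caller's list is left untouched
    while banned_to_check:
        word = banned_to_check.pop()
        # (?i): ignore case; \b: whole words only; re.escape: treat the word literally
        regex = r'(?i)\b' + re.escape(word) + r'\b'
        message = re.sub(regex, '*' * len(word), message)
    return message

banned = ["things", "show", "nature", "strange image"]
# hypothetical stand-in for the contents of the forum file
sample = (
    "Are things real?\n"
    "What is nature?\n"
    "Show me your image.\n"
    "You have shown me a strange image, and they are strange prisoners.\n"
)
print(censor_with_regex(sample, banned), end="")
```

Because `\b` requires a word boundary after `show`, the word `shown` is left alone, while `Show` is still caught through the `(?i)` flag.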

It's a bit messier without regex, but you can search the lowercase version of the string, and then manually check the ends of word matches aren't letters.

def censor(string, banned):
    string = '¬' + string + '¬'  # sentinel characters so the index lookups below cannot go out of bounds
    to_check = banned[:]
    while to_check:
        banned_word = to_check.pop()
        prev_idx = 0
        while (idx := string.lower().find(banned_word, prev_idx)) >= 0:
            pre = string[idx-1]
            post = string[idx + len(banned_word)]
            if not pre.isalpha() and not post.isalpha():
                string = string[:idx] + '*' * len(banned_word) + string[idx + len(banned_word):]
            prev_idx = idx + 1
    return string[1:-1]  # strip the two sentinel characters added at the start
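To connect this back to the line-by-line loop in the question, here is a minimal sketch that applies `censor` to each line using only `while` loops. The function is repeated so the snippet runs on its own, and the `lines` list is an assumed stand-in for what `f.readlines()` would return. The `:=` operator needs Python 3.8 or later.

```python
def censor(string, banned):
    string = '¬' + string + '¬'  # sentinels so the neighbour lookups stay in bounds
    to_check = banned[:]
    while to_check:
        banned_word = to_check.pop()
        prev_idx = 0
        # search the lowercased text so "Show" is found as well as "show"
        while (idx := string.lower().find(banned_word, prev_idx)) >= 0:
            pre = string[idx - 1]
            post = string[idx + len(banned_word)]
            # only censor when the match is not part of a longer word
            if not pre.isalpha() and not post.isalpha():
                string = string[:idx] + '*' * len(banned_word) + string[idx + len(banned_word):]
            prev_idx = idx + 1
    return string[1:-1]  # strip the sentinels again

banned = ["things", "show", "nature", "strange image"]
# hypothetical stand-in for message = f.readlines()
lines = [
    "Are things real?\n",
    "What is nature?\n",
    "Show me your image.\n",
    "You have shown me a strange image, and they are strange prisoners.\n",
]

new_forum = []
i = 0
while i < len(lines):
    new_forum.append(censor(lines[i], banned))
    i += 1

print("".join(new_forum), end="")
```

The resulting `new_forum` list can then be written out with the same `while` loop the question already uses for the output file.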
bn_ln