I was trying to write a program that removes all punctuation from a given input sentence. The code looked roughly like this:

from string import punctuation
sent = str(input())
def rempunc(string):
    for i in string:
        word =''
        list = [0]
        if i in punctuation: 
            x = string.index(i)
            word += string[list[-1]:x]+' '
            list.append(x)
    list_2 = word.split(' ')
    return list_2
print(rempunc(sent))

However, the output comes out as follows:

This state ment has @ 1 ! punc.

['This', 'state', 'ment', 'has', '@', '1', '!', 'punc', '']

Why isn't the punctuation being removed entirely? Am I missing something in the code?

I tried replacing x with x-1 in line 7, but it did not help. Now I'm stuck and don't know what else to try.

  • What is the purpose of the `word` and `list` variables? – quamrana Apr 02 '22 at 20:02
  • `string[list[-1]:x]` starts at the last index where you found some punctuation, so you never remove anything. Note that using a list to keep track of the previous index is very clumsy, just use a variable. – Thierry Lathuille Apr 02 '22 at 20:07
  • See https://stackoverflow.com/questions/265960/best-way-to-strip-punctuation-from-a-string for efficient ways to remove punctuation. – Thierry Lathuille Apr 02 '22 at 20:08

1 Answer

Repeated string slicing isn't necessary here.
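
That said, if you want to keep the slicing idea, the problems in the original are that `word` and `list` are reset on every pass through the loop, `string.index(i)` always returns the first occurrence of a character, and each slice starts at the previous punctuation mark itself rather than just after it. A minimal sketch of a slicing version that avoids these, using a plain variable instead of a list as suggested in the comments (the names `rempunc_sliced`, `pieces`, and `start` are mine, not from the question):

```python
from string import punctuation

def rempunc_sliced(text):
    """Hypothetical slicing-based variant; not the code from the question."""
    pieces = []
    start = 0  # index just past the previous punctuation character
    for pos, ch in enumerate(text):
        if ch in punctuation:
            pieces.append(text[start:pos])  # text between punctuation marks
            start = pos + 1                 # skip the punctuation itself
    pieces.append(text[start:])             # keep the tail after the last mark
    # split() with no arguments also discards the leftover extra spaces
    return ''.join(pieces).split()

print(rempunc_sliced("This state ment has @ 1 ! punc."))
# ['This', 'state', 'ment', 'has', '1', 'punc']
```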

I would suggest using filter() to drop the undesired characters from each word, collecting the cleaned words with a list comprehension. A second filter() call then removes any empty strings left behind by words that consisted only of punctuation:

from string import punctuation

def remove_punctuation(s):
    cleaned_words = [''.join(filter(lambda x: x not in punctuation, word))
                        for word in s.split()]
    return list(filter(lambda x: x != "", cleaned_words))
    
print(remove_punctuation(input()))

For the input from the question, this outputs:

['This', 'state', 'ment', 'has', '1', 'punc']
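
Another option, along the lines of the question linked in the comments, is `str.translate` with a table built by `str.maketrans`. This is only a sketch of that alternative, and the names `_STRIP_PUNCT` and `remove_punctuation_translate` are made up for illustration:

```python
from string import punctuation

# Translation table whose third argument lists the characters to delete.
_STRIP_PUNCT = str.maketrans('', '', punctuation)

def remove_punctuation_translate(s):
    # Delete all punctuation first, then split on whitespace;
    # split() with no arguments never yields empty strings.
    return s.translate(_STRIP_PUNCT).split()

print(remove_punctuation_translate("This state ment has @ 1 ! punc."))
# ['This', 'state', 'ment', 'has', '1', 'punc']
```

Because `split()` is called without arguments, no second filtering step for empty strings is needed here.
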
BrokenBenchmark