How to make this code disregard all punctuation from the sentence?

Question

I've written this code to analyse an input sentence so that the user can search for a certain word within it. However, I can't figure out how to make it disregard all the punctuation in the input sentence. I need this because, if a sentence such as "hello there, friend" is input, the word "there" is stored as "there," and so if the user searches for "there" the program says it is not in the sentence. Please help me. I'm really new to Python.

print("Please enter a sentence")
sentence=input()
lowersen=(sentence.lower())
print(lowersen)
splitlowersen=(lowersen.split())
print (splitlowersen)
print("Enter word")
word=input()
lword=(word.lower())
if lword in splitlowersen:
    print(lword, "is in sentence")
    for i, j in enumerate (splitlowersen):
        if j==lword:
            print(""+lword+"","is in position", i+1)    

if lword not in splitlowersen:
    print (lword, "is not in sentence")

score 0 · Answer 1 · answered Mar 02 '17 at 10:49

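You could split the string on all punctuation marks. Note that str.split accepts only a single separator string, so s.split(".,!?") would look for the literal sequence ".,!?" rather than for each mark. re.split with a character class splits on any of them:

import re
s = "This, is a line."
# split on runs of whitespace or the listed punctuation, dropping empty pieces
f = [w for w in re.split(r"[\s.,!?]+", s) if w]
print(f)  # ['This', 'is', 'a', 'line']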

Tim B

score 0 · Accepted Answer · answered Mar 02 '17 at 10:56

import string

print("Please enter a sentence")
sentence = input()
lowersen = sentence.lower()
print(lowersen)
# remove every punctuation character, then split into a list of words
nopunct = "".join(c for c in lowersen if c not in string.punctuation)
splitlowersen = nopunct.split()
print("Enter word")
word = input()
lword = word.lower()
if lword in splitlowersen:
    print(lword, "is in sentence")
    for i, j in enumerate(splitlowersen):
        if j == lword:
            print(lword, "is in position", i + 1)
else:
    print(lword, "is not in sentence")

Output:

Please enter a sentence
hello, friend
hello, friend
Enter word
hello
hello is in sentence
hello is in position 1
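
A variation on the same idea, as a minimal sketch: str.translate with a table built by str.maketrans deletes every character in string.punctuation in one pass. The function name find_positions is made up for illustration and is not part of the original answer.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)

def find_positions(sentence, word):
    """Return the 1-based positions of word in sentence, ignoring case and punctuation."""
    words = sentence.lower().translate(_STRIP_PUNCT).split()
    target = word.lower()
    return [i for i, w in enumerate(words, start=1) if w == target]

print(find_positions("Hello there, friend", "there"))  # [2]
print(find_positions("Hello there, friend", "bye"))    # []
```

Keep in mind that string.punctuation covers ASCII punctuation only, and that apostrophes are removed as well, so "don't" becomes "dont".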

score 0 · Answer 3 · answered Mar 02 '17 at 11:03

This is maybe a little long-winded, but it works in Python 3.

# Keep only letters and spaces; every other character (punctuation, digits) is removed
sentence = ''.join(filter(lambda x: x.isalpha() or x == ' ', sentence))
# the rest of your code will work after this.

There are a couple of advanced concepts in here.

filter takes a function and an iterable and returns an iterator over the items for which the function returns true. https://docs.python.org/3/library/functions.html#filter

lambda creates an anonymous function that checks each character for us. https://docs.python.org/3/reference/expressions.html#lambda

x.isalpha() checks that the character in question is actually a letter, followed by x == ' ' to see whether it is a space. https://docs.python.org/3.6/library/stdtypes.html?highlight=isalpha#str.isalpha

''.join takes the results of the filter and puts them back together into a string for you. https://docs.python.org/3.6/library/stdtypes.html?highlight=isalpha#str.join
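
Putting those pieces together, here is a small self-contained sketch of the filter approach applied to the sentence from the question. The helper name keep_letters_and_spaces is hypothetical and only there for illustration.

```python
def keep_letters_and_spaces(sentence):
    """Drop every character that is neither a letter nor a space."""
    return ''.join(filter(lambda ch: ch.isalpha() or ch == ' ', sentence))

cleaned = keep_letters_and_spaces("hello there, friend").lower()
words = cleaned.split()
print(words)             # ['hello', 'there', 'friend']
print("there" in words)  # True
```

Digits are dropped too, because isalpha() is false for them; swapping in isalnum() would be one way to keep them.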

score 0 · Answer 4 · edited May 23 '17 at 12:32

Or you could use the nltk package to tokenize your text. It splits the sentence into tokens as you would expect, and it also avoids common punctuation pitfalls: an abbreviation such as 'Mr.' is not broken apart at the period.

from nltk.tokenize import word_tokenize
string = "Hello there, friend"
words = word_tokenize(string)
print(words)

OUTPUT

['Hello', 'there', ',', 'friend']
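
Note that the token list above still contains ',' as a separate token. Below is a minimal sketch of one way to drop such tokens. It uses the list from the output above as a literal so that it runs without nltk installed; the filtering step is an addition and not part of the original answer.

```python
import string

# Tokens copied from the word_tokenize output shown above.
tokens = ['Hello', 'there', ',', 'friend']

# Keep a token unless every character in it is ASCII punctuation.
words = [t.lower() for t in tokens
         if not all(ch in string.punctuation for ch in t)]

print(words)             # ['hello', 'there', 'friend']
print("there" in words)  # True
```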

So try the nltk package and see if it works for you.


Hope this helps :)
