0

How can I remove punctuation from user input and create a set from the words in the input? So far I have this:

input_set = set(self.entry.get().lower().split(' '))
musicfiend122

3 Answers

4

Use str.translate:

Python 2:

>>> from string import punctuation
>>> s = 'sfda%$$sdafd dasf564%^%^, hgghg%#56'
>>> set(s.translate(None, punctuation).split())
set(['hgghg56', 'dasf564', 'sfdasdafd'])

Python 3:

from string import punctuation
s = 'sfda%$$sdafd dasf564%^%^, hgghg%#56'
# Map every punctuation code point to None so that translate() deletes it
tab = dict.fromkeys(map(ord, punctuation))
print(set(s.translate(tab).split()))
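
Applied to the question's input, the same idea might look like the sketch below. The helper name words_from_input is hypothetical, and it assumes Python 3, where str.maketrans('', '', chars) builds a table that deletes every character in chars (equivalent to the dict.fromkeys table above).

```python
from string import punctuation

# Translation table that deletes all ASCII punctuation characters.
_DELETE_PUNCT = str.maketrans('', '', punctuation)

def words_from_input(text):
    """Hypothetical helper: lowercase, strip punctuation, return the set of words."""
    return set(text.lower().translate(_DELETE_PUNCT).split())

print(sorted(words_from_input("Hello, World! hello?")))
```

In the question's code this would be called as words_from_input(self.entry.get()). Note that deleting punctuation joins the pieces around it, so "don't" becomes "dont".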
Ashwini Chaudhary
4

This is an excellent place to use regular expressions:

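import re
re.split(r'\W+', text)  # text is the string to split (avoid naming it str, which shadows the built-in)

Depending on what you consider punctuation, you may want to change \W (which matches any character other than a letter, digit, or underscore) to a different character class or character group.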
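
One thing to watch for: re.split returns empty strings when the text begins or ends with a separator, so they need filtering before building the set. A minimal sketch, where the function name words_via_split is made up for illustration:

```python
import re

def words_via_split(text):
    # Split on runs of non-word characters and drop the empty strings
    # that re.split produces at a leading or trailing separator.
    return {w for w in re.split(r'\W+', text.lower()) if w}

print(sorted(words_via_split('Hello, world! Hello...')))
```

Unlike deleting punctuation, splitting on it breaks "don't" into "don" and "t"; which behaviour is preferable depends on the input.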

Thayne
  • 1
    'Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.' – ktdrv Dec 03 '13 at 08:33
0
  1. Remove punctuation: see Best way to strip punctuation from a string in Python

  2. Create a set of words: set(sentence.split(' '))
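
A caveat on step 2: split(' ') treats every single space as a separator, so consecutive spaces produce empty strings, whereas split() with no argument treats any run of whitespace as one separator. A short sketch with a made-up sample sentence:

```python
sentence = 'the  quick brown  fox'  # note the double spaces

by_space = set(sentence.split(' '))    # consecutive spaces yield '' entries
by_whitespace = set(sentence.split())  # runs of whitespace count as one separator

print('' in by_space)       # the empty string ends up in the set
print('' in by_whitespace)  # no empty string here
```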

Community
IceArdor