Your regular expression is inverting the match: that's what the caret symbol (^) does inside square brackets (a negated character class). You first need to remove that.
Next, you should be matching a sequence of one or more characters (+) rather than zero or more characters (*). Using * will also match the empty string, which you don't want in this case.
Finally, your join should use a space as the separator rather than an empty string, which would not retain the spaces between the words:
newS = ' '.join(re.findall(r'[a-zA-Z]+', s))
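To see the effect of each fix, here is a small comparison sketch. The original pattern from the question is not shown here, so `[^a-zA-Z]*` is an assumed stand-in for it, and the variable names `negated`, `star`, and `plus` are only illustrative.

```python
import re

s = 'A man, a plan, a canal: Panama'

# Assumed stand-in for the original pattern: the negated class matches runs
# of NON-letters, and * also yields empty matches wherever a letter sits.
negated = re.findall(r'[^a-zA-Z]*', s)

# Positive class with *: letters are matched, but empty strings still
# appear at every position where no letter can be matched.
star = re.findall(r'[a-zA-Z]*', s)

# Positive class with +: only non-empty runs of letters are returned.
plus = re.findall(r'[a-zA-Z]+', s)

print(plus)            # ['A', 'man', 'a', 'plan', 'a', 'canal', 'Panama']
print(''.join(plus))   # AmanaplanacanalPanama
print(' '.join(plus))  # A man a plan a canal Panama
```

Joining on an empty string glues the words together, which is why the separator has to be a space.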
Though not essential in this case, it's advised to use raw strings for regular expressions (the r prefix). More in this post.
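As a rough illustration of why raw strings matter, consider a pattern that contains a backslash. The `\b` word-boundary example and the `text` variable below are my own choices, not part of the original question.

```python
import re

# In a normal string literal, '\b' is a single backspace character;
# in a raw string, r'\b' is a backslash followed by 'b', which the
# re module interprets as a word boundary.
print(len('\b'))   # 1
print(len(r'\b'))  # 2

text = 'a plan, a canal'

raw_result = re.findall(r'\bcanal\b', text)   # word-boundary match
plain_result = re.findall('\bcanal\b', text)  # looks for literal backspaces

print(raw_result)    # ['canal']
print(plain_result)  # []
```

The pattern `[a-zA-Z]+` has no backslashes, so it behaves the same with or without the prefix; using `r` consistently just avoids surprises once escapes are added.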
Full working code:
import re
s = 'A man, a plan, a canal: Panama'
newS = ' '.join(re.findall(r'[a-zA-Z]+', s))
print(newS)