I want to extract only the letters from a given string, but my code does exactly the opposite:

s = "A man, a plan, a canal: Panama"

newS = ''.join(re.findall("[^a-zA-Z]*", s))
print(newS)  # my output:  ,  ,  : 

The expected output string is:

"A man a plan a canal Panama"

1 Answer

Your regular expression inverts the match. A caret (^) at the start of a character class negates it, so [^a-zA-Z] matches every character except letters. Remove the caret first.

Next, match a sequence of one or more characters (+) rather than zero or more (*). With *, the pattern also matches the empty string wherever no letter is present, which you don't want in this case.
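To see the difference between the two quantifiers, here is a small sketch on a shortened sample string (the names `sample`, `with_star`, and `with_plus` are only illustrative):

```python
import re

sample = "A man, a plan"  # shortened sample input, for illustration only

# Zero or more letters: also yields empty strings where no letter is present
with_star = re.findall(r"[a-zA-Z]*", sample)

# One or more letters: yields only the words
with_plus = re.findall(r"[a-zA-Z]+", sample)

print(with_star)  # contains '' entries between the words
print(with_plus)  # ['A', 'man', 'a', 'plan']
```

Because the empty matches carry no text, they vanish when joined on an empty string, but they show up as extra separators when joined on a space.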

Finally, join with a space rather than an empty string to get the intended output. Joining on an empty string would not retain the spaces between the words.

newS = ' '.join(re.findall(r'[a-zA-Z]+', s))

Though not essential in this case, it's advisable to use raw strings (the r prefix) for regular expressions, so that Python does not interpret backslashes as escape sequences before the regex engine sees them.
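As a rough illustration of why the prefix matters once a pattern contains backslashes, consider \b, which is a backspace character in a normal string literal but a word boundary in a regex. The strings below are made-up examples, not part of the question:

```python
import re

text = "a cat sat"  # made-up sample text

plain = "\bcat\b"   # "\b" becomes a backspace character: 5 characters in total
raw = r"\bcat\b"    # backslashes are kept: the regex engine sees word boundaries

print(len(plain), len(raw))      # 5 7
print(re.findall(plain, text))   # []
print(re.findall(raw, text))     # ['cat']
```

The pattern [a-zA-Z]+ contains no backslashes, so it behaves the same with or without the prefix.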

Full working code:

import re

s = 'A man, a plan, a canal: Panama'

newS = ' '.join(re.findall(r'[a-zA-Z]+', s))

print(newS)
costaparas