-2

I have a string like the one below, which contains non-ASCII characters and other special characters:

 “Projected Set-tled Balan&ce†456$

How can I remove all the unwanted characters and get a clean string like the one below, which contains only lowercase or uppercase letters, digits, and spaces?

  Projected Settled Balance 456

I'm trying to achieve this with the regex [a-zA-Z0-9 ]. I want to get back the part of the string that matches this regex:

pat = re.compile('^[A-Za-z0-9 ]+')
stripped_string = string.strip().lower()
print(stripped_string)
print(pat.match(stripped_string))

But this is not returning anything.

crazycoder

2 Answers

1

This version does not use regex, since you had not asked for it originally:

# Keep only spaces and ASCII letters/digits (str.isascii needs Python 3.7+)
''.join(i for i in '“Projected Set-tled Balan&ce†456$' if i == ' ' or (i.isascii() and i.isalnum()))

Updated with a regex version:

re.sub(r'[^A-Za-z0-9\s]+','', '“Projected Set-tled Balan&ce†456$')
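
For a self-contained version of the regex approach, here is a minimal sketch. The helper name `keep_alnum_and_spaces` is hypothetical and not from the question. It uses a literal space instead of `\s`, on the assumption that tabs, newlines and non-ASCII whitespace should be dropped too. The last line reruns the attempt from the question: `match` only tests at the start of the string, and the first character there is already outside the class, so the result is `None`.

```python
import re

# One or more characters that are NOT an ASCII letter, digit, or plain space.
UNWANTED = re.compile(r'[^A-Za-z0-9 ]+')

def keep_alnum_and_spaces(text):
    """Return text with every character outside [A-Za-z0-9 ] removed."""
    return UNWANTED.sub('', text)

sample = '“Projected Set-tled Balan&ce†456$'
print(keep_alnum_and_spaces(sample))  # Projected Settled Balance456

# The attempt from the question: match() is anchored at position 0,
# where the character is non-ASCII, so nothing matches.
print(re.compile('^[A-Za-z0-9 ]+').match(sample.strip().lower()))  # None
```

Removing characters does not insert the space between "Balance" and "456" shown in the desired output; that would need a separate rule.
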
Kartikeya Sharma
0

aString.encode('ascii', 'ignore')

My bad, that was pretty dumb of me

Do that, but one letter at a time, and if you get an error, replace that character with an empty string.
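
A rough sketch of both variants, assuming Python 3, where `encode` returns `bytes` and a `decode` is needed to get a `str` back. The function names are made up for illustration. Either way, this only drops non-ASCII characters; ASCII punctuation such as `-`, `&` and `$` is left in place.

```python
def strip_non_ascii(text):
    """Drop every character that cannot be encoded as ASCII."""
    # errors='ignore' skips unencodable characters instead of raising;
    # decode() turns the resulting bytes back into a str.
    return text.encode('ascii', 'ignore').decode('ascii')

def strip_non_ascii_per_char(text):
    """Same result, one character at a time."""
    kept = []
    for ch in text:
        try:
            ch.encode('ascii')  # strict mode raises on non-ASCII input
        except UnicodeEncodeError:
            continue  # replace the offending character with nothing
        kept.append(ch)
    return ''.join(kept)

sample = '“Projected Set-tled Balan&ce†456$'
print(strip_non_ascii(sample))  # Projected Set-tled Balan&ce456$
```
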

This has been asked a lot; here are a couple of related questions:

How to remove nonAscii characters in python

Replace non-ASCII characters with a single space