I have this string:

irak|"iraq"|"زلزال العراق"|"زلزال ايران"|"هزة ارضية"|"هزة ارضية ايران"|"زلزله"|"زلزله ايران"|"زمين لرزه ايران"|"iran news"|"iran quake"|"earthquake"|"Iran-Iraq border earthquake"

I want to remove everything except English letters and numbers, so that entries written in other scripts are dropped along with the quotes. For the string above, the result should be:

irak|iraq|iran news|iran quake|earthquake|Iran-Iraq border earthquake

This should be done with a regex in Python.

PSEUDO
  • Does this answer your question? [How can I remove non-ASCII characters but leave periods and spaces using Python?](https://stackoverflow.com/questions/8689795/how-can-i-remove-non-ascii-characters-but-leave-periods-and-spaces-using-python) – sushanth Aug 08 '20 at 09:29

1 Answer

Here's a way to do that in two steps:

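import re

s1 = re.sub(r'\|[^a-zA-Z]+\|', "|", s)  # collapse runs of entries that contain no English letters
s2 = re.sub(r'"', "", s1)  # strip the remaining double quotes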

The output (s2) is:

irak|iraq|iran news|iran quake|earthquake|Iran-Iraq border earthquake
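
Note that the two-step substitution relies on the non-English entries sitting between two `|` separators, so it would miss one at the very start or end of the string, and it would also drop an entry made up only of digits. A minimal sketch of an alternative, assuming entries are separated by `|` and that an entry counts as English when it consists only of ASCII letters, digits, spaces, and hyphens (the names `ENGLISH_ENTRY` and `keep_english` are hypothetical, introduced here for illustration):

```python
import re

# Assumption: an "English" entry contains only ASCII letters, digits,
# spaces, and hyphens. Adjust the character class if other symbols are allowed.
ENGLISH_ENTRY = re.compile(r'[A-Za-z0-9 \-]+')

def keep_english(s):
    """Strip surrounding quotes, then keep only the |-separated entries that are fully English."""
    entries = (part.strip('"') for part in s.split('|'))
    return '|'.join(e for e in entries if ENGLISH_ENTRY.fullmatch(e))

# Shortened version of the string from the question.
s = 'irak|"iraq"|"زلزال العراق"|"زلزله"|"iran news"|"Iran-Iraq border earthquake"'
print(keep_english(s))
```

Because each entry is checked on its own, the position of a non-English entry in the string should not matter.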
Roy2012