0

I'm trying to remove a special character from an Arabic string using its Unicode code point, which I got from this link: https://www.fileformat.info/info/unicode/char/0640/index.htm

This is my code:

TATWEEL = u"\u0640"
text = 'الســلام عليكــم'

text.replace(TATWEEL, '')
print(text)

But when I run it, it doesn't work: it prints the same string without removing the character.

This is the special character 'ــ'

I'm using Python 3.

user8393084
  • 3
    when you say it *doesn't work* what do you mean? Did you receive an error or the pc you were using exploded? – Adelin Feb 13 '18 at 05:47
  • No it prints the same text without removing the character. – user8393084 Feb 13 '18 at 05:48
  • 1
    Possible duplicate of [Replacing or substituting in a python string does not work](https://stackoverflow.com/questions/15780139/replacing-or-substituting-in-a-python-string-does-not-work) – DYZ Feb 13 '18 at 06:19

2 Answers

8

The replace method of strings does not change the string it is called on; it returns a new string with the specified character replaced.

This code does what you want:

TATWEEL = u"\u0640"
text = 'الســلام عليكــم'

text2 = text.replace(TATWEEL, '')
print(text2)

To get the exact result you expected, assign the returned string back to text:

text = text.replace(TATWEEL, '')
print(text)
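
The point about replace returning a new string can be checked directly. The following is a minimal sketch, not part of the original answer; the helper name remove_tatweel is hypothetical and only used for illustration.

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL, the elongation character from the question


def remove_tatweel(text: str) -> str:
    """Return a copy of text without tatweel (hypothetical helper name)."""
    # str objects are immutable, so replace() cannot modify text in place;
    # it builds and returns a new string instead.
    return text.replace(TATWEEL, "")


original = 'الســلام عليكــم'
cleaned = remove_tatweel(original)

print(original)  # unchanged: the call above did not touch it
print(cleaned)   # the copy with every tatweel removed
```

Calling text.replace(...) on its own line, as in the question, throws the new string away; the result has to be bound to a name to be used.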
cco
1

If text may contain multiple unicode elements then you should go for regex as below:

import re
TATWEEL = u"\u0640"
text = 'الســلام عليكــم'

unicode_removed_text = re.sub(TATWEEL, '', text)
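
For a single literal character like this one, re.sub and replace give the same result. A regex mainly pays off when several different characters should be removed in one pass. The sketch below assumes, purely for illustration, that the Arabic diacritics U+064B to U+0652 should be stripped along with the tatweel; that range and the name strip_marks are not from the question.

```python
import re

# Character class: tatweel (U+0640) plus the diacritics U+064B..U+0652.
# The diacritic range is an assumption made for this example only.
STRIP_PATTERN = re.compile("[\u0640\u064B-\u0652]")


def strip_marks(text: str) -> str:
    """Remove every character matched by STRIP_PATTERN (hypothetical helper)."""
    return STRIP_PATTERN.sub("", text)


text = 'الســلام عليكــم'
unicode_removed_text = strip_marks(text)
print(unicode_removed_text)
```

If the character to remove could be a regex metacharacter, passing it through re.escape first avoids surprises; plain replace does not have that concern.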
Gahan
  • How is this better than `replace`? – Adelin Feb 13 '18 at 06:02
  • Refer to this [link](https://stackoverflow.com/questions/5668947/use-pythons-string-replace-vs-re-sub); it's a very good explanation of where to use replace() and where to use re.sub() – Gahan Feb 13 '18 at 06:06
  • 1
    First answer says *if you can use `replace`, use it* – Adelin Feb 13 '18 at 06:07
  • I still don't understand the rationale from your post "*If text may contain multiple unicode elements then you should go for regex*" – Adelin Feb 13 '18 at 06:08