You could use re.sub() to replace the desired character with the same character followed by the extra letters and a space. The \b word boundary makes sure that م is the first character in the word. The word boundary doesn't work well with Python 2.7 and UTF-8 (there, \b only treats non-ASCII letters as word characters when the re.UNICODE flag is set), so you could instead check for whitespace or the start of the string before your character.
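# -*- coding: utf-8 -*-
import re

token = u'معجبني'
# pattern = re.compile(u'\\bم')  # <- For Python 3
# For Python 2.7: match the letter only after whitespace or at the start of the string
pattern = re.compile(u'(\\s|^)م')
# \1 puts back the whitespace (if any) that was captured before the letter
print(pattern.sub(u'\\1ما ', token))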
It outputs:
ما عجبني
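As a variation on the "whitespace or start of string" check, here is a minimal sketch that uses a zero-width negative lookbehind, (?<!\S), instead of a capturing group. This is a swapped-in technique, not the pattern used above: it consumes no characters, so the spacing between words is left untouched. The helper name split_prefix and the two-word sample text are assumptions made for illustration.

```python
# -*- coding: utf-8 -*-
import re


def split_prefix(text, letter, replacement):
    """Replace `letter` with `replacement` only where it starts a word.

    Hypothetical helper for illustration; not part of the re module.
    """
    # (?<!\S) succeeds at the start of the string or right after whitespace.
    # It is zero-width, so nothing before the letter is consumed.
    pattern = re.compile(u'(?<!\\S)' + re.escape(letter))
    return pattern.sub(replacement, text)


# The second word ends with the same letter, which must stay unchanged.
result = split_prefix(u'معجبني اليوم', u'م', u'ما ')
print(result)
```

Only the letter at the start of the first word is replaced; the same letter at the end of the second word is preceded by a non-whitespace character, so the lookbehind rejects it.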
The English equivalent would be:
import re
pattern = re.compile(r'\bno')
text = 'nothing something nothing anode'
print(re.sub(pattern,'not ', text))
# not thing something not thing anode
Note that re.sub() replaces every match, so it automatically checks every word in the text.
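If only the first occurrence should be changed, the count argument of sub() limits the number of replacements. A small sketch reusing the English example (the variable names all_replaced and first_only are just for illustration):

```python
import re

pattern = re.compile(r'\bno')
text = 'nothing something nothing anode'

# Default (count=0): every word starting with "no" is rewritten.
all_replaced = pattern.sub('not ', text)

# count=1: only the first match is rewritten.
first_only = pattern.sub('not ', text, count=1)

print(all_replaced)  # not thing something not thing anode
print(first_only)    # not thing something nothing anode
```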