I have converted Bengali text into its English (Latin) phonetic spelling. After parsing, some stray characters were left in the output, and I want to remove them. My data frame looks like this:
col1
utto্tor
dokkho্shin
muuns্si
I want to remove each stray character together with the characters immediately before and after it. For example, in the first row I want to remove ্ as well as the o and the t that are adjacent to it.
My desired output looks like this:
col1 col2
utto্tor uttor
dokkho্shin dokkhhin
muuns্si muuni
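One possible way to get col2 is a regular expression that matches the stray character plus one character on each side. This is only a sketch: it assumes the stray character is always the Bengali hasanta (U+09CD), and the name HASANTA is my own, not something from Avro or pandas. The neighbours are made optional with `.?` so that a hasanta at the very start or end of a string is still removed.

```python
import pandas as pd

# Assumption: the stray character is the Bengali hasanta/virama, U+09CD.
HASANTA = "\u09cd"

df = pd.DataFrame(
    {"col1": ["utto\u09cdtor", "dokkho\u09cdshin", "muuns\u09cdsi"]}
)

# Match an optional character before the hasanta, the hasanta itself,
# and an optional character after it, then delete the whole match.
pattern = f".?{HASANTA}.?"
df["col2"] = df["col1"].str.replace(pattern, "", regex=True)

print(df)
```

If two stray characters ever sit closer than two characters apart, the matches cannot overlap, so the result for such rows would need to be checked by hand.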
P.S. I got these characters from the Avro parser, which I used as shown below:
reversed_text = avro.reverse("উত্তর")
print(reversed_text)
output: utto্tor
col0 col1
উত্তর utto্tor
দক্ষিণ dokkho্shin
মুন্সী muuns্si