I am working on a linguistics project (the language is Malayalam).
My list is:
x = [u'1\u0d30\u0d3e\u0d2e\u0d28\u0d4d\u200d', u'5\u0d05\u0d35\u0d28\u0d4d\u200d']
I want to extract the leading integer and the Unicode text from each item in the list.
The expected output is:
1 \u0d30\u0d3e\u0d2e\u0d28\u0d4d\u200d
5 \u0d05\u0d35\u0d28\u0d4d\u200d
First I tried to convert the first item, x[0], to ASCII:
print unicodedata.normalize('NFKD',x[0]).encode('ascii','ignore')
The output is 1.
I think only the digit survives because the rest of the string is Malayalam text, which has no ASCII equivalent.
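To illustrate that guess, here is a minimal sketch that repeats the conversion on the first item and counts the characters that fall outside ASCII. The variable names (first, normalized, ascii_only, non_ascii) are only illustrative, and the result is stored rather than printed so it behaves the same on Python 2 and Python 3.

```python
import unicodedata

# First item from the list in the question.
first = u'1\u0d30\u0d3e\u0d2e\u0d28\u0d4d\u200d'

# NFKD only decomposes characters that have a decomposition mapping;
# these Malayalam letters have none, so they stay as they are.
normalized = unicodedata.normalize('NFKD', first)

# 'ignore' silently drops every character that cannot be encoded as ASCII.
ascii_only = normalized.encode('ascii', 'ignore')

# Everything except the leading digit is above code point 127.
non_ascii = [ch for ch in normalized if ord(ch) > 127]
```

Here ascii_only holds just the digit, which matches the output above: the six remaining code points are all non-ASCII and are discarded by the 'ignore' error handler.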
Then I tried to find the first index of "\u", like this:
x[0].index("\u")
Doing this raised an error.
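A likely reason the search fails is that \u0d30 and the like are only how the characters are written in source code or in a repr; the string itself holds code points and no literal backslash. One possible way to get the expected output is sketched below, assuming every item starts with one or more ASCII digits. split_number_and_text is a hypothetical helper name, not a library function, and the text is re-escaped with the unicode_escape codec only so that the printed lines match the form shown in the question.

```python
import re

x = [u'1\u0d30\u0d3e\u0d2e\u0d28\u0d4d\u200d', u'5\u0d05\u0d35\u0d28\u0d4d\u200d']

# The items contain code points, not backslash-u text, so there is
# no backslash to search for.
has_backslash = any(u'\\' in item for item in x)


def split_number_and_text(item):
    """Split a string into (leading integer, remaining text).

    Hypothetical helper; assumes the item begins with ASCII digits.
    """
    # [0-9] is used instead of \d so that only ASCII digits are matched.
    match = re.match(u'([0-9]+)(.*)', item, re.DOTALL)
    if match is None:
        raise ValueError('item does not start with an ASCII digit')
    return int(match.group(1)), match.group(2)


pairs = [split_number_and_text(item) for item in x]

lines = []
for number, text in pairs:
    # Turn the Malayalam characters back into \uXXXX escapes for display.
    escaped = text.encode('unicode_escape').decode('ascii')
    lines.append('%d %s' % (number, escaped))

print('\n'.join(lines))
```

The second element of each pair is the Malayalam text itself, so it can be used directly; the escaping step is needed only if the \uXXXX form is wanted on screen.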