I am trying to detect emoticons in my sentences and assign a sentiment value to each one. I found a list of emoticons with their sentiment values and copied it into a CSV file, with each emoticon's Unicode value and its sentiment value, as shown below.
When I check whether the sentence contains an emoticon using a hard-coded literal, as below, it works:
if "\U0001f914" in sentence:
    print("in")
But when I loop through the CSV file (emoticons and sentiment values) and check whether each emoticon exists in the sentence, it doesn't work.
Below is my code:
Method 1:
for line in lines_emoji:
    senti_emoji_unicode, senti_emoji_score = line.strip().split(',')
    if senti_emoji_unicode in sentence:
        print("in")
Method 2:
for line in lines_emoji:
    senti_emoji_unicode, senti_emoji_score = line.strip().split(',')
    senti_emoji_unicode = '"'+senti_emoji_unicode+'"'
    if senti_emoji_unicode in sentence:
        print("in")
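A minimal sketch of why both methods probably fail, assuming the CSV stores the escape sequence as plain text (a backslash, `U`, and eight hex digits) rather than the emoji character itself. The `sentence` value here is made up for illustration. In source code, Python converts `"\U0001f914"` into a single character when it parses the literal; text read from a file never goes through that step, so it stays a 10-character string.

```python
# Hypothetical sentence containing the thinking-face emoji.
sentence = "Hmm \U0001f914 let me think"

from_source = "\U0001f914"   # escape processed by the Python parser: one character
from_csv = r"\U0001f914"     # stands in for the text read from the CSV: ten characters

print(len(from_source), len(from_csv))
print(from_source in sentence)   # the real emoji is found
print(from_csv in sentence)      # the literal backslash text is not

# Method 2 only wraps the same ten characters in quote characters.
quoted = '"' + from_csv + '"'
print(quoted in sentence)
```

Under that assumption, adding quotes (Method 2) cannot help, because the quotes become part of the string instead of turning it into a literal.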
Below is the updated full code, following the suggestions in the answers:
file_name_emoji = os.path.dirname(os.path.abspath(__file__)) + '/emoji sentiment.csv'
fo_emoji = open(file_name_emoji, 'r', encoding='utf-8')
lines_emoji = fo_emoji.readlines()
fo_emoji.close()
for line in lines_emoji:
    senti_emoji_unicode, senti_emoji_score = line.strip().split(',')
    emoji = senti_emoji_unicode.encode('utf-8').decode('unicode_escape')
    score = float(senti_emoji_score)
    if emoji in sentence:
        print('--------------------------------------')
This raises the error: 'unicodeescape' codec can't decode bytes in position 0-8: truncated \UXXXXXXXX escape. I have seen many posts about this error, with fixes such as adding an 'r' prefix and changing ''. But these fixes cannot be applied in my scenario, because the values come from a dynamic list rather than from literals in my code. I tried the following to apply the 'r' prefix, but the same error appears:
raw_s = "r'{0}'".format(senti_emoji_unicode)
raw = senti_emoji_unicode.encode('utf-8').decode('unicode_escape')
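One possible way around the error, sketched under assumptions: the `r` prefix is source-code syntax, so formatting it into a runtime string just adds the characters `r` and `'` to the data. The "truncated \UXXXXXXXX escape" message suggests that at least one CSV value has fewer than eight hex digits after `\U`, which `unicode_escape` rejects. The helper below (`decode_emoji_field`, a hypothetical name) skips the codec and converts the hex digits with `chr(int(..., 16))` instead. The sample rows and scores are invented for illustration; the real file format may differ.

```python
import re

# Optional prefix (\U, \u, U+, 0x) followed by 4 to 8 hex digits.
_HEX_RUN = re.compile(r"(?:\\[Uu]|[Uu]\+|0[xX])?([0-9A-Fa-f]{4,8})")


def decode_emoji_field(field):
    """Turn text such as '\\U0001f914', '\\U1f914' or 'U+1F914' into the emoji.

    Hypothetical helper: it assumes the field holds only code points in hex.
    """
    codes = _HEX_RUN.findall(field.strip())
    if not codes:
        raise ValueError("no code point found in {!r}".format(field))
    # chr() raises ValueError for values above 0x10FFFF.
    return "".join(chr(int(code, 16)) for code in codes)


sentence = "Hmm \U0001f914 let me think"

# Made-up rows standing in for lines read from the CSV file.
sample_lines = [r"\U0001f914,-0.2", r"\U1f914,-0.2", "U+1F914,-0.2"]

for line in sample_lines:
    # rsplit keeps the field intact if it ever contains a comma itself.
    senti_emoji_unicode, senti_emoji_score = line.strip().rsplit(",", 1)
    emoji = decode_emoji_field(senti_emoji_unicode)
    score = float(senti_emoji_score)
    if emoji in sentence:
        print("in", score)
```

If the CSV can be regenerated, storing the emoji characters themselves in a UTF-8 file would avoid the decoding step altogether.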