I have a set of ICS data that I am trying to parse in Python. This data uses emojis to indicate different types of events, so I am trying to use these emojis in an if statement to tell what type of event it is. I am trying to compare like this:

if event == '✈️':
    do something here

When event equals a ✈️ the comparison does not evaluate as true. I'm guessing it has something to do with the encoding, but I can't wrap my head around it. Any help would be much appreciated.

DasPete
  • I tried your `U+2708` code but got this error: `selfParse.py:32: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal if event == U'2708'` I think I need to convert `event` to unicode somehow? – DasPete Oct 01 '18 at 05:24
  • Try it with the actual emoji and replace ' with " (Single quote, w/ double), I'm also seeing a few people using this `from emoji import UNICODE_EMOJI` –  Oct 01 '18 at 05:27
  • Tried it like this `if "✈️" == U"2708": print 'match found'` with the same result. The interesting thing is that when I get rid of the quotes around the emoji, it changes to a smaller plane and a question mark with a box around it in my text editor. If I try to copy it here, it just changes right back to the emoji – DasPete Oct 01 '18 at 05:31
  • I think this is simply a syntax issue. I'm not terribly familiar w/ python emoji usage. I know it has something to do with the syntax/import. –  Oct 01 '18 at 05:35
  • Check this link out and see if it helps any. Essentially you need to find out how to validate an "emoji" or unicode. https://stackoverflow.com/questions/41604811/python-unicode-character-conversion-for-emoji –  Oct 01 '18 at 05:37
  • So this: `>>> b = '\U0001F600' >>> print b.decode('unicode-escape') ` works as expected. Now I just need to figure out how to run that in reverse – DasPete Oct 01 '18 at 05:41
  • I don't even need to compare emoji to emoji. If there was just some way to break it down into the actual bytes and compare those, that would do the trick too – DasPete Oct 01 '18 at 05:54
  • What about converting it to a string value? If you can convert it to a string value, assign it to a variable and then run your conditional check on that variable you can do it dynamically for more than one emoji. https://stackoverflow.com/questions/25707222/print-python-emoji-as-unicode-string –  Oct 01 '18 at 05:56
  • I tried this in the terminal and it just prints the original emoji: `string = str("✈️") print string` – DasPete Oct 01 '18 at 06:01
  • I'm going to have to try it some more tomorrow. But this `string = str("✈️") if string == str("✈️"): print 'match' else: print 'no match'` gives me a match. However, when I try it with the data coming from the ics file I don't get any matches – DasPete Oct 01 '18 at 06:04
  • Not really an answer because it's not truly comparing emojis. But I was able to accomplish what I needed by doing this: `if string.find(str("✈️")) != -1: print 'flight found'` – DasPete Oct 01 '18 at 16:11
  • so getting it to a string did work. Awesome! –  Oct 01 '18 at 18:50
  • Seems like it. Though comparing directly to the emoji didn't seem to work. I had to test against the find function like I did above – DasPete Oct 01 '18 at 20:25
  • What do you get from `print repr(event)` when there is no match? – Mark Tolonen Oct 04 '18 at 01:54

2 Answers

That particular character is represented as two code points: U+2708 (AIRPLANE) followed by U+FE0F (VARIATION SELECTOR-16). In Python 2 you also need to declare the encoding of your source file to use non-ASCII characters in source, and use Unicode strings for both the event and the item to compare:

#coding:utf8
event = u'\u2708\ufe0f'
if event == u'✈️':
    print 'match'

Output:

match

Your event might not be a Unicode string. Check type(event) and print repr(event) to see its actual content.
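To make that inspection concrete, here is a minimal sketch, written for Python 3 rather than the Python 2 used above, that lists the code points in a string. The helper name `describe` and the `plain` variable are made up for illustration; they show why a one-code-point airplane does not compare equal to the two-code-point emoji.

```python
import unicodedata

def describe(text):
    """Return (hex code point, Unicode name) for each character in text."""
    return [("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>")) for ch in text]

event = "\u2708\ufe0f"  # the emoji from the question: airplane + variation selector
plain = "\u2708"        # the airplane alone, without the variation selector

print(describe(event))  # two entries, one per code point
print(event == plain)   # False: the strings differ in length and content
```

If `describe(event)` on the real data shows extra characters (text after the emoji, or no U+FE0F at all), an exact `==` against the literal will fail even though the glyph looks the same.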

You can get non-Unicode strings to compare but they have to be encoded the same way. Again, print repr(event) is needed to see what the problem is. Ideally, decode input text into Unicode, process as Unicode in code, encode back to bytes to write the text back out to a database, file, network pipe, etc.
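The decode, process, encode flow might look like the sketch below. It is written for Python 3, and the sample bytes, the `AIRPLANE` constant, and the `is_flight` helper are assumptions for illustration, not the asker's actual data or code.

```python
# Hypothetical raw bytes, standing in for a line read from an .ics file in binary mode.
raw = b"SUMMARY:\xe2\x9c\x88\xef\xb8\x8f Flight to Denver"

# U+2708 on its own, so the check works with or without the U+FE0F selector.
AIRPLANE = "\u2708"

def is_flight(summary_bytes):
    """Decode UTF-8 bytes to text, then look for the airplane code point."""
    text = summary_bytes.decode("utf-8")
    return AIRPLANE in text

print(is_flight(raw))

# Encode back to bytes only when writing the text out again.
round_trip = raw.decode("utf-8").encode("utf-8")
print(round_trip == raw)
```

A containment test like this may also explain why the `find` approach in the comments worked where `==` did not: if the event string holds more than just the emoji, equality fails but a substring search succeeds.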

Also, switch to Python 3, which has much better Unicode handling.

Mark Tolonen

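Try decoding the byte string to Unicode first, then encoding it in an escaped form.

# Decode the UTF-8 bytes to a unicode object (Python 2)
teststring = unicode(teststring, 'utf-8')

# Encode it as an ASCII-only escaped string, e.g. '\u2708\ufe0f'
teststring = teststring.encode('unicode_escape')

# Then run the check against the test string; event must be escaped the same way
if event == teststring:
    pass  # run this code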
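For comparison, a rough Python 3 equivalent of the escaping step is sketched here. The helper name `escaped` is hypothetical; the escaped form is mostly useful for seeing exactly which code points a string holds, and both sides of a comparison would need the same treatment.

```python
def escaped(text):
    """Return an ASCII-only escaped form of text using the unicode_escape codec."""
    return text.encode("unicode_escape").decode("ascii")

event = "\u2708\ufe0f"
print(escaped(event))

# The escaped form can be turned back into the original text.
restored = escaped(event).encode("ascii").decode("unicode_escape")
print(restored == event)
```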
  • I actually do have the emoji module imported and comparing like you showed does not resolve true – DasPete Oct 01 '18 at 05:22
  • Ok, give me a sec. Going to give this a try myself and see if I can get it working. –  Oct 01 '18 at 05:23