
I spent about four hours researching the "UnicodeWarning: Unicode unequal comparison" issue. Usually, after a few hours, I'm able to answer my trickiest questions by myself, but that wasn't the case here. And I mean "tricky" for myself, of course. ;-)

I know that similar questions have been answered online and on this site, but I'm too much of a beginner to understand those answers well, so they haven't helped me. Maybe the best way for me to get it is to have someone point out what needs to be changed in my code.

I use Python 2.5 on Windows XP.

What I was able to figure out

I understand that my problem comes from trying to compare apples and oranges (Unicode and ASCII, or maybe bytes, or something like that). What I don't know is a practical way to solve this.

Here is my code:

# coding: iso-8859-1
import sys
from easygui import *

actual_answer = "pureté"
answer_given = enterbox("Type your answer!\n\nHint: 'pureté'")

if answer_given == actual_answer:
    msgbox("Correct! The answer is 'pureté'")
else:
    msgbox("Bug!")

Here is the error message I get:

UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal

user1975126
  • If it were Python 3, the string in the variable would already be unicode, and no library would return a bytestring. This is Python 2. – jsbueno Jan 14 '13 at 00:47
  • Either way, the OP states clearly that he's on Python 2.5 on Windows XP... – Dayan Jan 14 '13 at 00:48

2 Answers


First, read this: http://www.joelonsoftware.com/articles/Unicode.html

Second, you should not really use the iso-8859-1 encoding when working with Python on any system; use utf-8 instead.

Third, your easygui component is returning a unicode object rather than a byte string. The easiest fix for the code above is to make actual_answer a unicode object as well, by prefixing the quotes with a "u", as in:

actual_answer = u"pureté"
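If the expected answer is not a literal (for example, it comes out of a list), the same idea should apply: decode the byte string to a unicode object before comparing. This is only a minimal sketch, assuming the stored strings are ISO-8859-1 bytes as declared in the coding line. `answer_list` and `to_unicode` are hypothetical names used for illustration, and the `b""` prefix needs Python 2.6 or later (on 2.5, drop the prefix, since a plain literal is already a byte string there).

```python
# Hypothetical data: each entry is (question, answer), stored as ISO-8859-1 bytes.
answer_list = [
    (b"Quality of being pure (French)?", b"puret\xe9"),
]

def to_unicode(raw, encoding="iso-8859-1"):
    """Return raw as a unicode object, decoding it first if it is a byte string."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw

actual_answer = to_unicode(answer_list[0][1])
answer_given = u"puret\u00e9"  # stands in for the unicode object enterbox() returns

# Both sides are unicode objects now, so no implicit ASCII conversion is attempted.
print(answer_given == actual_answer)
```

Decoding once, at the point where the data enters the program, keeps every later comparison unicode-to-unicode.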
jsbueno
  • Thank you very much. I read the article and the obvious conclusion is that UTF-8 is what I need here. So I did what you suggested and my program now works. – user1975126 Jan 14 '13 at 04:43
  • What if instead the answer is stored in an array? For instance: "actual_answer = answer_list[random_choice][1]"? How do I convert that to Unicode? – user1975126 Jan 14 '13 at 04:52

Here's a function that re-encodes a Latin-1 byte string as UTF-8:

  def utf8(text):
      # Decode the Latin-1 bytes to unicode, then encode the result as UTF-8 bytes.
      return unicode(text, 'latin1').encode('utf-8')
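To see what this conversion does at the byte level, here is a small check. It is a sketch rather than the function above: it uses `bytes.decode` in place of the `unicode()` constructor so that it also runs on Python 3, and the `b""` prefix assumes Python 2.6 or later. Note that the result is still a byte string, not a unicode object.

```python
latin1_bytes = b"puret\xe9"  # in ISO-8859-1, the accented e is the single byte 0xE9
utf8_bytes = latin1_bytes.decode("latin1").encode("utf-8")

print(repr(utf8_bytes))   # in UTF-8, the accented e becomes the two bytes 0xC3 0xA9
print(len(latin1_bytes))  # byte length before re-encoding
print(len(utf8_bytes))    # byte length after re-encoding
```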

Also, have you tried using unicode escapes?

print "puret\u00E9".decode("unicode_escape")

For example, you can apply this to your code like so:

# coding: iso-8859-1
import sys
from easygui import *

actual_answer = "puret\u00E9".decode("unicode_escape")
answer_given = enterbox("Type your answer!\n\nHint: " + actual_answer)

if answer_given == actual_answer:
    msgbox("Correct! The answer is " + actual_answer)
else:
    msgbox("Bug!")
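The `decode("unicode_escape")` step is only needed because the literal is a byte string. A unicode literal accepts the same `\u00E9` escape directly, which a quick check can confirm. This is a sketch: the `b""` prefix assumes Python 2.6 or later (on 2.5 a plain `"..."` literal is already a byte string), and the backslash is doubled so that the bytes contain a literal backslash followed by `u00E9`.

```python
escaped = b"puret\\u00E9".decode("unicode_escape")  # escape resolved at decode time
literal = u"puret\u00E9"                             # escape resolved by the unicode literal

# Both spellings should produce the same six-character unicode object.
print(escaped == literal)
```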

Refer to the Python Unicode HOWTO for more detailed information on Unicode escapes: http://docs.python.org/2/howto/unicode.html

Dayan