Two strings and unicodes that are exactly the same do not return true when evaluated with each other

Question

This makes no sense to me. My ultimate goal is to break a very long string into its individual words and compare each word with something. I am only posting a snippet of the total output so as not to clutter this post with unnecessary data.

I created a simple setup to check whether these two 'words' are the same. In raw form they come out as unicode, but comparing them as unicode, or after converting them to str, does not return True. Here is a sample output:

u'altmer'/<type 'unicode'> | u'altmer'<type 'unicode'> | False

Here is the code that compares the two.

ch.msg("%r/%r | %r%r | %s" % (color_entrytxt_list[index].lower(),
                              type(color_entrytxt_list[index].lower()),
                              entry,
                              type(entry),
                              "True" if color_entrytxt_list[index].lower() == entry.lower() else "False"))


* Note: I have also tried `is` instead of `==`.
** I have also tried converting each unicode to a string via `str(unicode)`.

I don't understand why this evaluates to False.

EDIT:

Thank you everyone for helping with this question. As a few folks mentioned, the problem was indeed that the two strings were not structurally the same: some of their characters were not being output to the screen.
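For anyone who hits the same thing, here is a minimal sketch of how such a mismatch can be spotted by comparing lengths and listing the code point of every character. The values `a` and `b` and the helper `describe` are hypothetical, and the zero-width space is only an assumed stand-in for whatever characters were hidden in my case.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function  # keeps the sketch runnable on Python 2 and 3
import unicodedata


def describe(text):
    """Return the length of text and a (code point, Unicode name) pair per character."""
    chars = [("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>")) for ch in text]
    return len(text), chars


# Hypothetical values: b carries a trailing zero-width space (U+200B),
# which many terminals and clients render as nothing at all.
a = u"altmer"
b = u"altmer\u200b"

print(a == b)              # False
print(len(a), len(b))      # 6 7
print(describe(b)[1][-1])  # ('U+200B', 'ZERO WIDTH SPACE')
```

If the lengths differ, or the per-character listing shows something unexpected, the strings only look alike on screen and `==` is correctly reporting False.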

Possible duplicate of [python-unicode-equal-comparison-failed](http://stackoverflow.com/questions/18193305/python-unicode-equal-comparison-failed) – BPL, Aug 18 '16 at 01:05
Could be, but even == doesn't work so that post won't help me out. – Uys of Spades, Aug 18 '16 at 01:10
I tried it with '==' while statically defining all your variables and it worked for me. Could something else be wrong? – engineer14, Aug 18 '16 at 01:12
@Uys of Spades I have read your question and I haven't understood what you're trying to compare. For instance, the `==` operator should be used to compare unicode strings, for example: `>>> u'foo'==u'foo' -> True` – BPL, Aug 18 '16 at 01:12
@UysofSpades a couple of other suggestions would be to print the length and the hash of both unicode objects. There may be characters that are not visible when printed – engineer14, Aug 18 '16 at 01:17
@Uys of Spades This problem is really trivial, but you haven't posted a [mcve](http://stackoverflow.com/help/mcve) question. In any case, I'd recommend you use repr to print both unicode strings before the comparison with `==` so you'll see why it is giving you False. Believe me, `==` is the right choice here... So if it's giving False, it's because those strings are not structurally equal – BPL, Aug 18 '16 at 01:30
@UysofSpades: The solution shouldn't go in your question, but in an answer (preferably with some more detail, like _what_ characters caused the problem). Later, after a waiting period, you can [accept your own answer](https://stackoverflow.com/help/self-answer), so that everyone sees this problem has been solved. – Kevin J. Chase, Aug 18 '16 at 03:52

Alexander · Answer 1 · 2016-08-18T01:12:19.197

0

You need to use `==` to check for equality instead of `is`, which checks whether both names refer to the same object (that is, whether they occupy the same memory location).

>>> s1 = 'abc' * 10000
>>> s2 = 'abc' * 10000

>>> id(s1)
4480540672

>>> id(s2)
4480570880

>>> s1 is s2
False

>>> s1 == s2
True

For some short strings, CPython may reuse the same object (an implementation detail that should not be relied on), so `is` can happen to return True. Larger strings like the ones above, however, each get their own object.

edited Aug 18 '16 at 01:12

answered Aug 18 '16 at 01:10


1

Perhaps you should include the string values that are causing the problems? – Alexander Aug 18 '16 at 01:14
