4

I had some code that I was editing to make it more understandable at a glance and I thought I should change all the char=="|" to char is "|". I know that it looks like I'm dumbing it down too much but it does look better. Anyway, I decided to pycheck one last time and I got this warning:

Warnings...

test.py:7: Using is |, may not always work
Processing module test (test.py)...

For the life of me, I can't imagine a situation when "|" is "|" will return False unless you start venturing into multibyte character encodings, CJK characters and the like, if I'm not wrong. Is there some other situation that I've missed?

Plakhoy

3 Answers

13

== checks whether the values on both sides are equal, whereas is checks whether both names refer to the same object. The two operators serve entirely different purposes. For example,

a = "aa"
b = "aa"
print(a, b, id(a), id(b))
print(a == b)
print(a is b)

Output on my machine

aa aa 140634964365520 140634964365520
True
True

Since a and b are bound to the same string data (strings are immutable in Python), Python optimizes by reusing one object. That is why is and == both return True. Whereas with

a = "aa"
b = "aaa"[:2]
print(a, b, id(a), id(b))
print(a == b)
print(a is b)

Output on my machine

aa aa 139680667038464 139680667014248
True
False

Though a and b hold the same data (they are equal), they are two separate objects, stored in different places in memory.

So, never use the is operator to check for equality.
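Applied to the question's case, the comparison can be sketched as follows. This is only an illustration: count_pipes is a hypothetical helper name, not code from the question, and the example assumes Python 3.

```python
def count_pipes(text):
    """Count '|' characters by comparing values with ==, not identities."""
    return sum(1 for char in text if char == "|")


# A string assembled at runtime still compares equal to a literal,
# regardless of whether the interpreter happens to reuse one object.
assembled = "".join(["a", "a"])

print(count_pipes("a|b|c"))
print(assembled == "aa")
```

Whether assembled is "aa" would also be True is left to the implementation, which is exactly why == is the right test here.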

thefourtheye
  • 2
    Minor nitpick: Memory storage location is more or less irrelevant. The relevant bit is that they are two separate objects. – Lennart Regebro Nov 11 '13 at 06:19
  • @LennartRegebro Thanks :) Added that also in my answer. – thefourtheye Nov 11 '13 at 06:30
  • 1
    I think Guido missed an optimization there. The interpreter should have recognized that the slice had already been interned and hence reused it... Anyway, it's clearer now. – Plakhoy Nov 11 '13 at 07:21
  • 2
    @Segfault, but isn't there a cost with doing that check _every_ time someone slices a string for the few times it'll match an existing string? Just because there is a reference to it doesn't mean it's interned either. – John La Rooy Nov 11 '13 at 08:59
7

In CPython, all single-character strings are interned, so the same character should always have the same id.

This is just an implementation detail and shouldn't be relied on.

There are a few places where you might want to check that two strings have the same identity, but your use case is not one of them. On PyPy, for example, the check fails:

Python 2.7.2 (1.9+dfsg-1, Jun 19 2012, 23:23:45)
[PyPy 1.9.0 with GCC 4.7.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``"let's not be obscure, unless we
really need to"''
>>>> a = "|"
>>>> a is "|"
False
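One of the few places where string identity is meaningful is after interning the strings explicitly. A minimal sketch, assuming Python 3, where the call is sys.intern (in Python 2 it was the builtin intern); the variable names are illustrative only:

```python
import sys

# Build two equal strings at runtime so that nothing forces them
# to share one object.
parts = ["pipe", "line"]
first = "".join(parts)
second = "".join(parts)

# Equality holds no matter how the strings are stored.
print(first == second)

# sys.intern returns the canonical copy from the intern table, so two
# equal strings map to the very same object.
canonical_first = sys.intern(first)
canonical_second = sys.intern(second)
print(canonical_first is canonical_second)
```

Even then, == gives the same answer, so identity checks on interned strings are only a speed optimization.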
John La Rooy
1

I am interpreting the question here as "Where do I actually use 'is' to check things?".

The "is" check is typically used when a default named argument to a function is None and you want to see if something was given to the function. None is always None. If the caller gives 0 or [] or '', those all evaluate to False in a boolean context.

False == 0 == bool('') == bool([]) == bool(None)

returns

True

And so checking the bool of the variable won't give you the desired behavior if the user of your function wants the function to carry on like it normally would for such an argument. The None acts like a sentinel to check if the function got an argument that was intentionally provided. This check is of particular utility when you assign the variable to a mutable empty container object.

e.g.:

def foo(fn_list=None):
    if fn_list is None:
        fn_list = []
    fn_list.append('foo')
    return fn_list
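To see why the None check matters, here is a rough comparison of three variants. All three function names are hypothetical and chosen only for this illustration; the example assumes Python 3.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, at definition time, and is
    # shared by every call that omits bucket.
    bucket.append(item)
    return bucket


def append_truthy(item, bucket=None):
    # "not bucket" is also true for an empty list the caller passed in,
    # so that list gets silently replaced.
    if not bucket:
        bucket = []
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # "is None" is true only when no list was supplied.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_one = append_shared("a")
shared_two = append_shared("b")
print(shared_one is shared_two)  # both calls returned the same default list

callers_list = []
append_truthy("x", callers_list)
print(callers_list)              # still empty: the function used a new list

append_fresh("x", callers_list)
print(callers_list)              # now contains "x"
```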

So

a is b

is the same as checking

id(a) == id(b)

If that evaluates to True, a and b are the same object. As others have pointed out, don't use is to check for equality: whether equal values share one object varies between implementations, and semantically you don't care whether it is the same object, you care whether the two items are equivalent.
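A short sketch of the difference, using lists so that no interning is involved (the variable names are made up for the example; Python 3 assumed):

```python
original = [1, 2, 3]
alias = original        # a second name for the same object
duplicate = list(original)  # a new object with equal contents

print(original == duplicate)          # equal values
print(original is duplicate)          # but two distinct objects
print(original is alias)              # one object, two names
print(id(original) == id(alias))      # same result as the is check
```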

Russia Must Remove Putin