6

I know that `is` is used to check whether two objects are the same object, while `==` checks for equality. In my experience `is` has always worked for numbers because Python reuses number objects. For example:

>>> a = 3
>>> a is 3
True

And I'm used to using `is` whenever I compare something to a number. But `is` didn't work in the program below:

from collections import namedtuple
# Code taken directly from the Udacity site.
# make a basic Link class
Link = namedtuple('Link', ['id', 'submitter_id', 'submitted_time', 'votes',
                           'title', 'url'])

# list of Links to work with
links = [
    Link(0, 60398, 1334014208.0, 109,
         "C overtakes Java as the No. 1 programming language in the TIOBE index.",
         "http://pixelstech.net/article/index.php?id=1333969280"),
    Link(1, 60254, 1333962645.0, 891,
         "This explains why technical books are all ridiculously thick and overpriced",
         "http://prog21.dadgum.com/65.html"),
    Link(23, 62945, 1333894106.0, 351,
         "Learn Haskell Fast and Hard",
         "http://yannesposito.com/Scratch/en/blog/Haskell-the-Hard-Way/"),
    Link(2, 6084, 1333996166.0, 81,
         "Announcing Yesod 1.0- a robust, developer friendly, high performance web framework for Haskell",
         "http://www.yesodweb.com/blog/2012/04/announcing-yesod-1-0"),
    Link(3, 30305, 1333968061.0, 270,
         "TIL about the Lisp Curse",
         "http://www.winestockwebdesign.com/Essays/Lisp_Curse.html"),
    Link(4, 59008, 1334016506.0, 19,
         "The Downfall of Imperative Programming. Functional Programming and the Multicore Revolution",
         "http://fpcomplete.com/the-downfall-of-imperative-programming/"),
    Link(5, 8712, 1333993676.0, 26,
         "Open Source - Twitter Stock Market Game - ",
         "http://www.twitstreet.com/"),
    Link(6, 48626, 1333975127.0, 63,
         "First look: Qt 5 makes JavaScript a first-class citizen for app development",
         "http://arstechnica.com/business/news/2012/04/an-in-depth-look-at-qt-5-making-javascript-a-first-class-citizen-for-native-cross-platform-developme.ars"),
    Link(7, 30172, 1334017294.0, 5,
         "Benchmark of Dictionary Structures", "http://lh3lh3.users.sourceforge.net/udb.shtml"),
    Link(8, 678, 1334014446.0, 7,
         "If It's Not on Prod, It Doesn't Count: The Value of Frequent Releases",
         "http://bits.shutterstock.com/?p=165"),
    Link(9, 29168, 1334006443.0, 18,
         "Language proposal: dave",
         "http://davelang.github.com/"),
    Link(17, 48626, 1334020271.0, 1,
         "LispNYC and EmacsNYC meetup Tuesday Night: Large Scale Development with Elisp ",
         "http://www.meetup.com/LispNYC/events/47373722/"),
    Link(101, 62443, 1334018620.0, 4,
         "research!rsc: Zip Files All The Way Down",
         "http://research.swtch.com/zip"),
    Link(12, 10262, 1334018169.0, 5,
         "The Tyranny of the Diff",
         "http://michaelfeathers.typepad.com/michael_feathers_blog/2012/04/the-tyranny-of-the-diff.html"),
    Link(13, 20831, 1333996529.0, 14,
         "Understanding NIO.2 File Channels in Java 7",
         "http://java.dzone.com/articles/understanding-nio2-file"),
    Link(15, 62443, 1333900877.0, 1244,
         "Why vector icons don't work",
         "http://www.pushing-pixels.org/2011/11/04/about-those-vector-icons.html"),
    Link(14, 30650, 1334013659.0, 3,
         "Python - Getting Data Into Graphite - Code Examples",
         "http://coreygoldberg.blogspot.com/2012/04/python-getting-data-into-graphite-code.html"),
    Link(16, 15330, 1333985877.0, 9,
         "Mozilla: The Web as the Platform and The Kilimanjaro Event",
         "https://groups.google.com/forum/?fromgroups#!topic/mozilla.dev.planning/Y9v46wFeejA"),
    Link(18, 62443, 1333939389.0, 104,
         "github is making me feel stupid(er)",
         "http://www.serpentine.com/blog/2012/04/08/github-is-making-me-feel-stupider/"),
    Link(19, 6937, 1333949857.0, 39,
         "BitC Retrospective: The Issues with Type Classes",
         "http://www.bitc-lang.org/pipermail/bitc-dev/2012-April/003315.html"),
    Link(20, 51067, 1333974585.0, 14,
         "Object Oriented C: Class-like Structures",
         "http://cecilsunkure.blogspot.com/2012/04/object-oriented-c-class-like-structures.html"),
    Link(10, 23944, 1333943632.0, 188,
         "The LOVE game framework version 0.8.0 has been released - with GLSL shader support!",
         "https://love2d.org/forums/viewtopic.php?f=3&t=8750"),
    Link(22, 39191, 1334005674.0, 11,
         "An open letter to language designers: Please kill your sacred cows. (megarant)",
         "http://joshondesign.com/2012/03/09/open-letter-language-designers"),
    Link(21, 3777, 1333996565.0, 2,
         "Developers guide to Garage48 hackatron",
         "http://martingryner.com/developers-guide-to-garage48-hackatron/"),
    Link(24, 48626, 1333934004.0, 17,
         "An R programmer looks at Julia",
         "http://www.r-bloggers.com/an-r-programmer-looks-at-julia/")]


# links is a list of Link objects. Links have a handful of properties. For
# example, a Link's number of votes can be accessed by link.votes if "link" is a
# Link.

# make the function query() return a list of Links submitted by user 62443, by
# submission time ascending

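def query():
    print("hello")
    # Filtering with `is` instead of `==` here produces an empty list.
    matches = [link for link in links if link.submitter_id == 62443]
    print(matches)
    # Sort by submitted_time (index 2 of the namedtuple), ascending.
    return sorted(matches, key=lambda link: link.submitted_time)
query()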

When I used `is` inside the query function, like this: [link for link in links if link.submitter_id is 62443], I got an empty list. But when I used ==, it worked fine.

For the most part, the code was taken directly from the Udacity site, but I also tried it on my local machine, with the same result. So I think the numbers are different objects in this case, but why? Is there a need for this?

EDIT: Yes, I admit this question is a duplicate and should be closed. But it duplicates the first linked post, not the second. I didn't know about that question before posting this.

My problem was that I thought number objects would always be reused.

Thanks to everyone, I got rid of a bad habit.

Gnijuohz
  • "I'm used to using is whenever I compare something to a number". Then you should expect your code to break in mysterious ways. Python has never guaranteed that numbers are singletons - and sometimes they aren't. Get over it and use `==` as Guido intended ;-) – Tim Peters Dec 19 '13 at 05:56
  • http://stackoverflow.com/questions/2987958/how-is-the-is-keyword-implemented-in-python – Buddhima Gamlath Dec 19 '13 at 05:58
  • Like running across the road without looking. You know it's the wrong thing but it seems to have worked in the past so... – John La Rooy Dec 19 '13 at 06:04
  • @Gnijuohz: Try your number example with a number like 1023 instead of 3 and you'll see a `False` instead of `True`. – Matthias Dec 19 '13 at 06:04
  • @TimPeters I always used it to compare numbers like -1 or 0, which are used as return values for some functions. That's why it always worked. I see the point now! – Gnijuohz Dec 19 '13 at 06:24

6 Answers

9

There's no answer to your question short of digging into details of the implementation of the specific version of Python you're using. Nothing is defined about whether a == b implies a is b when a and b are numbers. It's often true, especially for "little integers", because CPython keeps a cache of little integer objects and usually (not always!) returns the same object for a given small integer value. But nothing about that is defined, guaranteed, or even always the same across releases.
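
A minimal sketch of that point, assuming Python 3 syntax and using `int()` to build the numbers at run time so the compiler cannot merge two literals into one constant. The helper name `compare` is made up for illustration. Equality is guaranteed; the identity result is whatever the implementation happens to do.

```python
def compare(text):
    """Build the same integer twice and compare the results both ways."""
    a = int(text)
    b = int(text)
    return a == b, a is b


small_equal, small_identical = compare("3")
big_equal, big_identical = compare("62443")

# Equality is defined by the language: both lines print True.
print(small_equal)
print(big_equal)

# Identity is an implementation detail: the results may differ between
# interpreters and releases, so no expected output is shown here.
print(small_identical)
print(big_identical)
```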

Thinking about memory addresses can be mildly helpful, since that's how id() is implemented in CPython. But other implementations implement id() differently. For example, I was told that id() was a major pain to implement in Jython (Python implemented in Java), since Java is free to move objects around in memory during garbage collection (CPython does not: in CPython an object always occupies the memory originally allocated for it, until the object becomes trash).
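
To illustrate how `is` relates to `id()`, here is a small sketch (Python 3 syntax; the variable names are illustrative). For two objects that are alive at the same time, `x is y` gives the same answer as `id(x) == id(y)`, whatever `id()` happens to return on a given implementation.

```python
original = [1, 2, 3]
alias = original          # a second name for the very same list
clone = list(original)    # a new list with equal contents

# Same object: both checks agree and are True.
print(alias is original, id(alias) == id(original))

# Equal contents, but a distinct object: both identity checks are False.
print(clone == original)
print(clone is original, id(clone) == id(original))
```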

The only intended - and supported - use for `is` is to check whether two names for objects in fact resolve to the very same object. For example, regardless of the type of b, after

a = b

it must be the case that

a is b

is True. And that's sometimes useful, for example to detect whether an optional argument was passed at all:

_sentinel = object() # create a unique object

def somefunc(optional=_sentinel):
    if optional is _sentinel:  # we know for sure nothing was passed
        ...
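
A runnable version of the same sentinel idea might look like the sketch below. The names `_MISSING` and `describe` are hypothetical; the point is that even `None` can be told apart from "nothing was passed".

```python
_MISSING = object()  # unique object that no caller can pass by accident


def describe(value=_MISSING):
    """Report whether an argument was supplied at all."""
    if value is _MISSING:
        return "no argument passed"
    return "got %r" % (value,)


print(describe())      # no argument passed
print(describe(None))  # got None
print(describe(0))     # got 0
```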

The other main use is for the handful of objects guaranteed to be singletons. None, True and False are examples of that, and indeed it's idiomatic to write:

if a is None:

instead of:

if a == None:

The first way succeeds if and only if a is in fact bound to the singleton None object, but the second way may succeed if a is of any type such that a.__eq__(None) returns True.
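
As a small illustration of that difference, consider a deliberately contrived class (the name `AlwaysEqual` is made up here) whose `__eq__` returns True for everything:

```python
class AlwaysEqual(object):
    """Hypothetical class whose instances claim to equal everything."""

    def __eq__(self, other):
        return True


thing = AlwaysEqual()

print(thing == None)  # True: AlwaysEqual.__eq__ is consulted
print(thing is None)  # False: thing is not the None object
```

The `==` test is fooled by the class, while the identity test cannot be overridden.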

Don't use `is` for numbers. That's insane ;-)

Tim Peters
  • Great answer. I had the impression that Python would always reuse objects for numbers. Now I know it's wrong. – Gnijuohz Dec 19 '13 at 06:22
  • Well, that would be extremely expensive. For example, consider `a = 10**10000` followed by `b = 10**10000`. After computing `10**10000` for the second time, Python would have to search through all numbers in existence to realize that `a` already had the same value. Not in our lifetimes ;-) – Tim Peters Dec 19 '13 at 06:27
  • I see your point! Thanks! – Gnijuohz Dec 19 '13 at 06:35
5

You are correct that `is` checks identity (whether the two variables refer to the same object) and that `==` checks equality (whether the objects are equal). What "equal" means is decided by the classes involved.

And you are correct that using `is` to check whether two numbers are equal often works, because Python reuses numbers, so when they are equal they are often also identical.

But you do notice how that sounds. Should you check if two numbers are equal, by using the identity check? No, of course not. You should use the equality check to check if objects are equal. It's as simple as that.
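
Applied to the question's code, a sketch of the equality-based filter could look like this. It uses Python 3 syntax and only three of the original links; passing the submitter id as a parameter is a change made here for illustration and is not part of the original exercise.

```python
from collections import namedtuple

Link = namedtuple('Link', ['id', 'submitter_id', 'submitted_time', 'votes',
                           'title', 'url'])

# A subset of the links from the question.
links = [
    Link(101, 62443, 1334018620.0, 4,
         "research!rsc: Zip Files All The Way Down",
         "http://research.swtch.com/zip"),
    Link(15, 62443, 1333900877.0, 1244,
         "Why vector icons don't work",
         "http://www.pushing-pixels.org/2011/11/04/about-those-vector-icons.html"),
    Link(18, 62443, 1333939389.0, 104,
         "github is making me feel stupid(er)",
         "http://www.serpentine.com/blog/2012/04/08/github-is-making-me-feel-stupider/"),
]


def query(submitter_id):
    """Return the links from one submitter, oldest first."""
    matches = [link for link in links if link.submitter_id == submitter_id]
    return sorted(matches, key=lambda link: link.submitted_time)


result = query(62443)
print([link.id for link in result])  # [15, 18, 101]
```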

That you can often use the identity check to test the equality of numbers is just a side effect of Python reusing numbers, which it does to save memory, and which is an implementation detail.

Besides, in Python `3 == 3.0` is True, but `3 is 3.0` is False. So you should use `==` for that reason too.
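
A short sketch of that last point, in Python 3 syntax. The values are built with `int()` and `float()` rather than written as literals, because recent CPython versions warn about `is` with a literal:

```python
count = int("3")       # an int
ratio = float("3.0")   # a float with the same numeric value

print(count == ratio)  # True: the values compare equal
print(count is ratio)  # False: an int and a float are different objects
```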

Lennart Regebro
  • Yes I should use `==`. But I used is with small numbers and they always worked. it became a habit. Thanks! – Gnijuohz Dec 19 '13 at 06:16
1

There are two parts to this question:

  1. `is` does not check for equality, it only checks for identity. Two objects have the same identity iff they are the same object, i.e. id(a) == id(b)

    >>> a = 10
    >>> b = a
    >>> id(a), id(b)
    (30388628, 30388628)
    
  2. CPython (and possibly other implementations) caches integers within a certain range, so even though they are different objects, yet they have the same identity

Thus

>>> a = 200
>>> a is 200
True
>>> a = 2000
>>> a is 2000
False
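
To see where the reuse stops on a particular interpreter, one could probe a few values as sketched below (Python 3 syntax; `looks_cached` is a made-up helper). The output is deliberately not shown because it depends on the implementation and release; only the `==` comparison is guaranteed.

```python
def looks_cached(n):
    """Return True if two separately built ints with value n are one object."""
    first = int(str(n))
    second = int(str(n))
    return first is second


for n in (-5, 0, 200, 256, 257, 2000, 62443):
    # Equality always holds; identity depends on the interpreter.
    print(n, int(str(n)) == n, looks_cached(n))
```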
Abhijit
  • So in short for big numbers `is` won't work. – Gnijuohz Dec 19 '13 at 06:15
  • @Gnijuohz So in short the reason it works sometimes is an internal implementation detail of CPython. It's not expected to work, and you should not rely on it working. – Lennart Regebro Dec 19 '13 at 06:17
  • "So even though they are different objects, yet they have the same identity" - Wrong. It's the same object. Otherwise they would have different identities. – Lennart Regebro Dec 19 '13 at 06:18
  • @LennartRegebro ok, for small numbers CPython reuses the same object but for big numbers it does not. Is this statement correct? – Gnijuohz Dec 19 '13 at 06:19
  • @Gnijuohz Yes. Although I think there might be cases when they are not the same, but I don't remember when that was. It's irrelevant as it is an implementation detail you don't need to care about. – Lennart Regebro Dec 19 '13 at 06:23
0

The `is` operator compares the identities of two objects. So basically you're comparing whether the objects have the same identity, not whether they are equal.

If you print the ids of the objects involved using the id() function, you can see that their ids are different, so the `is` operator doesn't work in this case:

>>> print [(id(link),id(62443)) for link in links if link.submitter_id == 62443]
[(28741560, 21284824), (28860576, 21284824), (28860744, 21284824)]

Just because two objects are equal does not mean their identities are the same.


Note: After an object is garbage collected, its id is available to be reused. So the use of the is operator is actually somewhat discouraged

K DawG
  • Why the downvote? BTW, fixed in case it caused any confusion @user2864740 – K DawG Dec 19 '13 at 06:01
  • @KDawG This is a _really_ bad way to do things mate. – Games Brainiac Dec 19 '13 at 06:09
  • The OP starts his question with "I know that is is used to compare if two objects are the same but == is for equality." So he knows this already. Just sayin'. – Lennart Regebro Dec 19 '13 at 06:12
  • Your example appears to prove the opposite of what you're explaining above. Short strings are interned (if they conform to certain rules). – Tim Pietzcker Dec 19 '13 at 06:12
  • *"Note: After an object is garbage collected, its id is available to be reused. So the use of the is operator is actually somewhat discouraged"* - Eh. WAT? Who told you that? That makes no sense. Go laugh at them. – Lennart Regebro Dec 19 '13 at 06:14
  • @LennartRegebro Well, their ids are actually available to be reused, so it might bring unexpected results, right? – K DawG Dec 19 '13 at 06:15
  • @KDawG: No it might not. The object is only garbage collected when you no longer have any references to it. – Lennart Regebro Dec 19 '13 at 06:19
0

This is because `is` compares identities:

>>> a = 10
>>> id(a)
30967348
>>> id(10)
30967348
>>> a is 10
True
>>> a += 1
>>> a
11
>>> id(a)
30967336
>>> id(11)
30967336
>>> a is 11
True
>>> a = 106657.334
>>> id(a)
44817088
>>> id(106657.334)
31000752
>>> a is 106657.334
False
>>> a == 106657.334
True
Games Brainiac
0

`is` is used to compare identity.

In [26]: a = 3

In [27]: a is 3
Out[27]: True

In [28]: id(a)
Out[28]: 140479182211448

In [29]: id(3)
Out[29]: 140479182211448

Extending the same check to the example above:

In [32]: for link in links:
   ....:     print id(link.submitter_id), id(62443), id(link.submitter_id) == id(62443), link.submitter_id
   ....:

140479184066728 140479184065152 False 60398
140479184066872 140479184065152 False 60254
140479184065688 140479184065152 False 62945
140479184064984 140479184065152 False 6084
140479184064648 140479184065152 False 30305
140479184063416 140479184065152 False 59008
140479184063608 140479184065152 False 8712
140479184063752 140479184065152 False 48626
140479184064352 140479184065152 False 30172
140479185936456 140479184065152 False 678
140479185966096 140479184065152 False 29168
140479184063752 140479184065152 False 48626
140479185936888 140479184065152 False 62443
140479184052336 140479184065152 False 10262
140479184061232 140479184065152 False 20831
140479185936888 140479184065152 False 62443
140479184057712 140479184065152 False 30650
140479185957880 140479184065152 False 15330
140479185936888 140479184065152 False 62443
140479185959760 140479184065152 False 6937
140479184061528 140479184065152 False 51067
140479184058728 140479184065152 False 23944
140479185944264 140479184065152 False 39191
140479184062568 140479184065152 False 3777
140479184063752 140479184065152 False 48626

Use `is` when checking for identity.

Ref: String comparison in Python: is vs. ==

Kracekumar