Python dunder method for `is`

Question

While looking through the docs, and specifically here http://docs.python.org/2/reference/expressions.html#is, I still can't find the dunder/protocol method that defines the implementation of the Python keyword `is`. What method determines it? From what I understand, all `is` does is compare the results of the `id` function when called on two objects.

You are right, and `id` returns the memory location of the object. So unless you want to put your object in some specific location in RAM, you should never need to write your own `is` implementation. If you want to check for equivalence of two objects, use `__eq__` instead – inspectorG4dget, Aug 28 '13 at 19:21
I just learned the meaning of [dunder](http://wiki.python.org/moin/DunderAlias)! – Steven Rumbalski, Aug 28 '13 at 19:23
@StevenRumbalski Ha, upvote the thread to show appreciation :p – , Aug 28 '13 at 19:23
@inspectorG4dget: That's a little misleading, because it's only true for CPython. Jython or PyPy couldn't return a memory location even if they wanted to. – abarnert, Aug 28 '13 at 19:34
@abarnert: fair enough. I /was/ just referring to CPython - I should have made that clear – inspectorG4dget, Aug 28 '13 at 19:35

abarnert · Accepted Answer · 2013-08-28T19:33:20.517

There is no dunder method for `is`. You can't override it, and that's intentional. The whole point of `is` is that it tells you whether two expressions refer to the same object, so by definition it has to be false for two different objects. An override would have nothing legitimate to change.

As the docs put it:

The operators `is` and `is not` test for object identity: `x is y` is true if and only if `x` and `y` are the same object.

(There's a little more in the Data model docs.)
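To illustrate the difference between overridable equality and fixed identity, here is a minimal sketch. The class `AlwaysEqual` is hypothetical and exists only for this example; it overrides `__eq__` so that `==` always succeeds, yet `is` is unaffected because no method is consulted for it.

```python
class AlwaysEqual:
    """Hypothetical class whose instances compare equal to anything."""

    def __eq__(self, other):
        return True

    # Defining __eq__ sets __hash__ to None; restore it so instances stay hashable.
    __hash__ = object.__hash__


a = AlwaysEqual()
b = AlwaysEqual()

equal = (a == b)        # dispatches to AlwaysEqual.__eq__
same_object = (a is b)  # identity test; no user-defined method is called
alias_same = (a is a)   # an object is always identical to itself

print(equal, same_object, alias_same)  # True False True
```

If you want custom comparison semantics, `__eq__` is the hook to use; `is` only ever answers "are these the same object?".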

Also, `is` doesn't compare the results of `id`.

`id` is just defined to return "an integer which is guaranteed to be unique and constant for this object during its lifetime". This means `is` certainly could use `id`, but I don't know of any implementation where it does.

That being said, in CPython it does effectively the same thing under the covers: `is` checks that the pointers are equal, while `id` casts the pointer to an integer and returns it. So the only difference between implementing it directly and implementing it via `id` would be an extra pair of function calls and a cast that would compile to no machine code…
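As a rough illustration (not a description of any implementation's internals), the sketch below compares `is` with an `id` comparison for objects that are all kept alive by names for the whole test. Under that assumption the uniqueness guarantee of `id` means the two tests agree; the variable names are made up for the example.

```python
a = object()
b = object()
alias = a  # a second name bound to the same object as `a`

# All three names stay bound, so every object is alive during each comparison.
pairs = [(a, a), (a, alias), (a, b)]
results = [((x is y), (id(x) == id(y))) for x, y in pairs]

print(results)  # [(True, True), (True, True), (False, False)]
```

The agreement relies on the objects having overlapping lifetimes; it says nothing about objects that no longer exist.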

But in other implementations, even that may not be true. (Which should be obvious, when you consider that Jython and PyPy are written in languages that don't even have such a thing as a pointer.) For example, in PyPy, `is` checks that the underlying RPython objects are the same, while `id` returns a key generated on the fly (and cached if you later call `id` on the same value).

edited Aug 28 '13 at 19:33

answered Aug 28 '13 at 19:21

abarnert


Wait, so there is no pure Python implementation for `is`, it's all in C? – Aug 28 '13 at 19:22
@EdgarAroutiounian: It's all in C in CPython. Of course it's in Java, .NET, or RPython in other Python implementations. – abarnert Aug 28 '13 at 19:23
Regardless of implementation, `id(a) == id(b)` *must* return the same result as `a is b`... would you agree? – nmclean Aug 28 '13 at 19:38
Nope. If `a` dies before `b` is created, they may have the same id despite being different objects. This isn't possible with `is`, since both objects must survive until the operator is evaluated. – user2357112 Aug 28 '13 at 19:39
@user2357112 Provided that `a` dies, which I assume means it's GC'd, then how could `id` return anything at all for it? – Aug 28 '13 at 19:41
It can be collected between the evaluation of its ID and the creation of `b`. Consider the call `id([]) == id([])` (which returned `True` when I just tried it). The first list is created, then its ID is evaluated. At this point, there are no more references to it, and it can be GC'd. Then the second list is created, and it's free to reuse the just-collected object. – user2357112 Aug 28 '13 at 19:44
@EdgarAroutiounian: This problem can also happen if you store `id` numbers and compare them later. Correctly using `id` for anything that couldn't be done with `is` can be tricky; you have to ensure that you only ever compare `id` numbers of objects with overlapping lifetimes. – Ben Aug 28 '13 at 21:57
@user2357112 For both expressions, it's equally impossible for `a` and `b` to be garbage collected during their evaluations. In order to be bit by garbage collection, you would need to write `ida = id(a)` followed sometime later by `idb = id(b)`, and *then* evaluate `ida == idb`... but that's not what I asked ;) – nmclean Aug 29 '13 at 17:43
@nmclean: If `a` and `b` are variables, they won't be collected, but if they're placeholders for arbitrary expressions, `a` may be collectable after `id(a)` is evaluated. Try `id([]) == id([])` and see what you get. – user2357112 Aug 29 '13 at 17:46
@user2357112: Of course the fun thing is that Python doesn't _guarantee_ that they'll be collected as soon as they can be, or that the memory will be reused, or that the `id` has anything to do with the memory location. In CPython, this will almost always be `True` because the GC works by refcounting, `id` is memory, and the allocator tries to reuse recently released memory whenever possible; in most other implementations it will almost always be `False` because the GC won't free the value immediately, and `id` values are not based on memory and aren't reused as quickly. – abarnert Aug 29 '13 at 17:56
@user2357112 I understand that (and it's interesting, since I wouldn't have expected the `[]` to actually be gc'd that quickly), but it muddles up the essence of my point which is that an *id comparison gives you the same answer as `is` with the same two parameters*. Consider that `(id([]) == id([])) == ([] is [])` would be an invalid test because it involves *four* different parameters, not two. – nmclean Aug 29 '13 at 18:14
"With the same two parameters" - that assumes that you're performing both tests, rather than replacing an `is` test with an `id` comparison. To perform both comparisons, you need to make guarantees about object lifetime that don't exist if you replace `is` with `id`. It seems much more likely that you'd be interested in replacing one test with the other. – user2357112 Aug 29 '13 at 18:29
@nmclean: In CPython, the GC works by refcounting, so unless you have a reference cycle (which you obviously don't for an empty list display), objects almost always get cleaned up immediately. That's not true for the other implementations—but the values _could_ still get cleaned up in the middle of the expression, so if you write code that assumes they never will, your code is still wrong in PyPy/etc. Anyway, the key point is that `[] is []` requires both values to be alive simultaneously, until the `is` finishes, while `id([]) == id([])` does not. – abarnert Aug 29 '13 at 18:31
While we're at it, it's also worth noting that `id` is a regular function in `builtins`, while `is` is a built-in operator. Besides the almost-certainly-irrelevant fact that looking up `id` and calling it twice and then calling `int.__eq__` is going to be slower, there's also the potentially-critical fact that `id` can easily be shadowed by a global or local (or even monkeypatched in `builtins`)… – abarnert Aug 29 '13 at 18:40
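
The two pitfalls raised in these comments (object lifetime and shadowing of `id`) can be sketched as follows. The helpers `make_list` and `compare_with_shadowed_id` are hypothetical names introduced only for this illustration, and the outcome of the `id` comparison on temporaries is deliberately left unstated because it depends on the implementation.

```python
def make_list():
    """Hypothetical helper that returns a fresh list on every call."""
    return []


# `is` needs both temporaries alive at the same time, so they are distinct objects.
identity_result = make_list() is make_list()

# Here the first list may be freed before the second is created, so the two
# ids may or may not coincide. Often True on CPython, but not guaranteed anywhere.
id_result = id(make_list()) == id(make_list())

print(identity_result)  # False
print(id_result)        # implementation-dependent


def compare_with_shadowed_id(x, y):
    """Hypothetical function in which a local name shadows the builtin `id`."""
    id = lambda obj: 0  # the builtin is no longer reachable by this name here
    return id(x) == id(y)


# Two distinct, live objects still "match" because `id` was shadowed.
shadowed = compare_with_shadowed_id(object(), object())
print(shadowed)  # True
```

An `is` test cannot be affected by either problem, which is one more reason to prefer it over comparing `id` values.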
