Possible Duplicate:
Python “is” operator behaves unexpectedly with integers
Why (0-6) is -6 = False?

So, while playing a bit with id (Python 2.6.5), I noticed the following (shell session):

>>> a = 1
>>> id(a)
140524904
>>> b = 1
>>> id(b)
140524904

Of course, as soon as I modify one of the variables it gets assigned to a new memory address, i.e.

>>> b += 1
>>> id(b)
140524892

Is it normal behavior to initially assign both variables that have identical values to the same memory location, or is it just an optimization of a particular implementation, e.g. CPython?

P.S. I spent a little time browsing the parser code, but couldn't find where and how variables are allocated.

nvlass

4 Answers

3

As mentioned by glglgl, this is an implementation detail of CPython. If you look at Objects/longobject.c in the source code for CPython (e.g. version 3.3.0), you'll find the answer to what's happening:

#if NSMALLNEGINTS + NSMALLPOSINTS > 0
/* Small integers are preallocated in this array so that they
   can be shared.
   The integers that are preallocated are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
static PyLongObject small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

This explains why, after a = 1; b = 1, a is b will be True, and stays True even after a += 2; b += 2; a -= 2; b -= 2. Whenever a computed value falls within the range covered by this array, the resulting object is simply taken from the array instead of being newly allocated, saving a bit of memory.
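A minimal sketch of that claim, assuming CPython (other implementations are free to behave differently). The helper name same_object_after_round_trip is made up for illustration, and the values are built with int() at run time so that shared literals in the compiled code do not influence the result.

```python
def same_object_after_round_trip(start, delta):
    """Return True if two names are bound to the same object after each
    is independently incremented and then decremented by `delta`."""
    a = int(str(start))  # constructed at run time, not taken from a literal
    b = int(str(start))
    a += delta
    b += delta
    a -= delta
    b -= delta
    return a is b


# On CPython, 1 and 3 both come from the small-int array: same object.
print(same_object_after_round_trip(1, 2))
# On CPython, 1000 and 1002 are outside the array: separate objects.
print(same_object_after_round_trip(1000, 2))
```

Either way, a == b holds in both cases; only the identity check depends on the implementation.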

You can figure out the bounds of this small_ints array using a function like this:

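def binary_search(predicate, lo, hi):
    # Invariant: predicate(lo) is true and predicate(hi) is false.
    # hi may lie above or below lo, so the same search serves both bounds.
    while abs(hi - lo) > 1:
        mid = (lo + hi) // 2  # floor division, so this also works on Python 3
        if predicate(mid):
            lo = mid
        else:
            hi = mid
    return lo

def is_small_int(n):
    # Compute n twice at run time; only a shared (cached) int gives one object.
    p = n + 1
    q = n + 1
    return (p - 1) is (q - 1)

def min_neg_small_int():
    p, q = -1, -1
    if p is not q:
        return 0
    # Double until the two results stop being the same object.
    while p is q:
        p += p
        q += q
    # p // 2 is still shared and p is not, so the bound lies between them.
    return binary_search(is_small_int, p // 2, p)

def max_pos_small_int():
    p, q = 1, 1
    if p is not q:
        return 0
    while p is q:
        p += p
        q += q
    return binary_search(is_small_int, p // 2, p)

def small_int_bounds():
    # (smallest shared int, largest shared int), both inclusive
    return (min_neg_small_int(), max_pos_small_int())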

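For my build (Python 2.7, 64-bit Windows), small_int_bounds() == (-5, 256). This means that the integers from -5 to 256 (inclusive) are shared through a small_ints array. In Python 3 that array is the one shown above in Objects/longobject.c; Python 2 keeps the equivalent array for its int type in Objects/intobject.c.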
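The bounds can also be spot-checked directly. The sketch below assumes a standard CPython build with the default range of -5 to 256; the helper name shared is hypothetical. Each int is parsed from a string at run time so that the two operands of `is` are constructed independently.

```python
def shared(n):
    """True if two independently constructed ints equal to n are one object."""
    return int(str(n)) is int(str(n))


# Values just outside and just inside the expected range.
for n in (-6, -5, 256, 257):
    print(n, shared(n))
```

On such a build this should report False for -6 and 257 and True for -5 and 256. None of this is guaranteed by the language, so code should never rely on it.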

-edit- I see elssar noted that there is a similar answer about interning of some literals. This fact is also mentioned in the documentation for PyInt_FromLong, as mentioned by this answer.

  • thank you very much for your thorough answer (and also for the pointer in the python source -- it would take a while to spot this)! and thanks to all the other useful answers and comments! – nvlass Dec 20 '12 at 17:29
  • The only reason it took me a while to find it is because I was looking for something with `int` in its name. But once you find this source file, it's pretty much the first thing in there. That might be because, and I quote, "`/* XXX The functional organization of this file is terrible */`" ;) –  Dec 20 '12 at 18:20
2
  1. In Python, all variables are references to objects, even when the value is a number.
  2. A number is an immutable object, so CPython does not need to create a new object for the same value.
  3. This does not mean that CPython will always reuse the same object.
  4. In your first example, the variables a and b point to the same object.
  5. When you do b += 1, you "create" a new object, 2, and b now points to it.
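The points above can be seen in a short sketch. It relies only on standard language behaviour: because ints are immutable, += rebinds the name rather than changing the object. The list at the end is added purely as a contrast with a mutable type.

```python
a = 1
b = a                 # b is bound to the same object as a
id_before = id(a)

b += 1                # rebinds b to the object 2; the object 1 is unchanged
print(a, b)                   # 1 2
print(id(a) == id_before)     # True: a still refers to the original object
print(a is b)                 # False: b now refers to a different object

# Contrast: a list is mutable, so += modifies the object in place.
x = [1]
y = x
y += [2]
print(x is y, x)              # True [1, 2]
```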
defuz
  • +1 I knew that tuples and strings were immutable, but never occurred to me that numbers are too (or probably I missed that one). – nvlass Dec 20 '12 at 16:39
2

Here the term "variables" must be made precise: there are objects on the one hand, and names that are bound to objects on the other.

If you do a = b = 1, both a and b are bound to the same object representing 1.

If you do a = 1; b = 1, I think it is a CPython detail that both names end up bound to the same object. In general, an implementation could choose to create two objects that both represent 1 and use one for each name. But as that would be a waste of memory, it is generally not done this way.
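A small sketch of the distinction between names and objects. The chained assignment is guaranteed by the language to bind both names to one object; the identity result for the two separately created ints is only what CPython is expected to do and may differ elsewhere. The value 100000 is an arbitrary choice, picked to be outside CPython's small-integer range.

```python
# One object, two names: true for any value, on any implementation.
a = b = int("100000")
print(a is b)         # True

# Two separately created objects with equal values.
c = int("100000")
d = int("100000")
print(c == d)         # True: the values are equal
print(c is d)         # implementation detail; CPython is expected to print False
```

This is why values should be compared with ==, and `is` reserved for identity checks such as `x is None`.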

glglgl
1

a and b both refer to the same object in memory (1), with the ID 140524904. Once you do b += 1 you have 2, which is located elsewhere.

snurre