9

I want to know how Python knows (if it knows) that a value-type object is already stored in its memory (and also knows where it is).

In this code, when the value 1 is assigned to b, how does Python know that the value 1 is already in memory, and how does it store a reference to it in b?

>>> a = 1
>>> b = 1
>>> a is b
True
Mad Physicist
Just A Lone

3 Answers

14

Python (CPython, to be precise) shares small integer objects to make access faster. Integers in the range [-5, 256] already exist in memory, so if you check their addresses, they are the same. For larger integers, that is not guaranteed.

a = 100000
b = 100000
a is b # False when each line is entered separately in the CPython REPL

Wait, what? If you check the address of the numbers, you'll find something interesting:

a = 1
b = 1
id(a) # 4463034512
id(b) # 4463034512

a = 257
b = 257
id(a) # 4642585200
id(b) # 4642585712

This is called the integer cache. You can read more about it here.
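To see the cache without the compiler getting in the way, here is a minimal sketch that builds each value twice at run time and checks whether both results are the same object. The helper name `is_cached` and the scanned range are my own choices for illustration, and the behaviour is a CPython implementation detail, so other interpreters may report something different.

```python
def is_cached(n):
    """Return True if two ints built at run time for n are the same object."""
    # Parse from a string so that each int is created at run time;
    # two equal literals in one code block could be merged by the compiler.
    a = int(str(n))
    b = int(str(n))
    return a is b


# Scan a window around the documented range and keep the shared values.
cached = [n for n in range(-10, 300) if is_cached(n)]
print("lowest shared value:", cached[0])
print("highest shared value:", cached[-1])
```

On a CPython build that matches the description above, the two printed bounds should be the ends of the [-5, 256] range.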

Thanks to the comments from @KlausD and @user2357112: using a small integer directly goes through the integer cache, but the result of a calculation, even if it equals a number in the range [-5, 256], is not always the cached object. For example:

pow(3, 47159012670, 47159012671) is 1 # False
pow(3, 47159012670, 47159012671) == 1 # True

“The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object.”

Why? Because small integers are used very frequently, for example as loop counters. Returning a reference to an existing object instead of creating a new one avoids the allocation overhead.

asn-0184
  • 6
    Just to make it clear: this is valid for the CPython interpreter. The language Python does not define this and other interpreters are free to have their own implementation. – Klaus D. Apr 19 '19 at 05:07
  • 2
    Also `10e5` is a float, not an int. (Also, not all small ints come from the small int cache. For example, on current CPython, `pow(3, 47159012670, 47159012671) == 1`, but `pow(3, 47159012670, 47159012671) is not 1`.) – user2357112 Apr 19 '19 at 05:37
5

If you take a look at Objects/longobject.c, which implements the int type for CPython, you will see that the numbers between -5 (-NSMALLNEGINTS) and 256 (NSMALLPOSINTS - 1) are pre-allocated and cached. This is done to avoid the penalty of allocating multiple unnecessary objects for the most commonly used integers. This works because integers are immutable: you don't need multiple objects to represent the same number.

Mad Physicist
0

Python doesn't know anything until you tell it. In your code above, when you initialize a and b, you are storing those values (in a register or in RAM) and naming the places where they are stored a and b, so that you can reference them later. If you didn't initialize a variable first, Python would just give you an error.

Andrew R.