How does Python know the values already stored in its memory?

Question

I want to know how Python knows (if it knows) that a value-type object is already stored in its memory (and also knows where it is).

For this code, when assigning the value 1 for b, how does it know that the value 1 is already in its memory and stores its reference in b?

>>> a = 1
>>> b = 1
>>> a is b
True

>>> hex(id(b))'0x7ffe705ee350' >>> hex(id(a)) '0x7ffe705ee350' — Just A Lone, Apr 19 '19 at 03:03
If two variables refer to the same value between -5 and 256 (as opposed to use) then by definition there is only one object. — YusufUMS, Apr 19 '19 at 03:04
Python, the *language* doesn't specify this. This is how it is *implemented* in CPython. — Nishant, Apr 19 '19 at 04:07
@cs95. I don't think your duplicate is a good choice. OP clearly understands what `is` does and is asking about how caching works. As the answers show, the question is sufficiently clear and different from the duplicate, in my opinion. — Mad Physicist, Apr 19 '19 at 13:56

asn-0184 · Accepted Answer · 2019-04-19T06:18:31.077

CPython, the reference implementation of Python, keeps a set of shared small integer objects for fast access. The integers in the range [-5, 256] already exist in memory, so every reference to one of them has the same address. This does not hold for larger integers.

a = 100000
b = 100000
a is b # False when entered line by line in the REPL; within one script the compiler may share the constant

Wait, what? If you check the address of the numbers, you'll find something interesting:

a = 1
b = 1
id(a) # 4463034512
id(b) # 4463034512

a = 257
b = 257
id(a) # 4642585200
id(b) # 4642585712

This is called the integer cache. You can read more about the integer cache here.
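One way to see where the sharing starts and stops, without relying on literals (which the compiler may merge inside a single code object), is to build each integer at runtime and compare identities. This is only a sketch: `is_shared` and `shared_range` are hypothetical helper names, the scan range is arbitrary, and whatever it reports is a CPython implementation detail that can differ between versions and interpreters.

```python
def is_shared(n):
    """Return True if two separately constructed ints equal to n are the same object."""
    text = str(n)
    first = int(text)   # built at runtime, so no compile-time constant sharing
    second = int(text)
    return first is second


def shared_range(lo=-50, hi=2000):
    """Collect every n in [lo, hi] for which the interpreter hands back one shared object."""
    return [n for n in range(lo, hi + 1) if is_shared(n)]


if __name__ == "__main__":
    shared = shared_range()
    if shared:
        print("shared integers span", shared[0], "to", shared[-1])
    else:
        print("no shared small integers found on this interpreter")
    print("10**6 shared?", is_shared(10**6))
```

The printed bounds are intentionally not hard-coded here; run it on your own interpreter to see what it reports.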

Thanks to @KlausD and @user2357112 for pointing out in the comments that using a small integer directly goes through the integer cache, whereas the result of a calculation, even when it equals a number in the range [-5, 256], may not be the cached object. For example:

pow(3, 47159012670, 47159012671) is 1 # False
pow(3, 47159012670, 47159012671) == 1 # True
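Since whether a computed result is the cached object is an implementation detail, the usual advice is to compare integers with `==` and keep `is` for identity checks such as `x is None`. A minimal sketch of the difference, where `describe` is a hypothetical helper and the identity result is deliberately not assumed:

```python
def describe(x, y):
    """Return (equal_by_value, same_object) for two integers."""
    return x == y, x is y


one = 1
result = pow(3, 47159012670, 47159012671)

equal, identical = describe(result, one)
print("equal by value:", equal)      # the comparison programs should rely on
print("same object:   ", identical)  # may be True or False depending on the CPython version
```

Binding `1` to a name first also avoids the `SyntaxWarning` that Python 3.8 and later emit for `is` with a literal.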

“The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object.”

Why? Because small integers are used very frequently, for example as loop counters. Handing out a reference to an existing object instead of creating a new one avoids the allocation overhead.

Just to make it clear: this is valid for the CPython interpreter. The language Python does not define this and other interpreters are free to have their own implementation. — Klaus D., Apr 19 '19 at 05:07
Also `10e5` is a float, not an int. (Also, not all small ints come from the small int cache. For example, on current CPython, `pow(3, 47159012670, 47159012671) == 1`, but `pow(3, 47159012670, 47159012671) is not 1`.) — user2357112, Apr 19 '19 at 05:37

score 5 · Answer 2 · answered Apr 19 '19 at 03:08

If you take a look at Objects/longobject.c, which implements the int type for CPython, you will see that the numbers between -5 (-NSMALLNEGINTS) and 256 (NSMALLPOSINTS - 1) are pre-allocated and cached. This is done to avoid the penalty of allocating multiple unnecessary objects for the most commonly used integers. This works because integers are immutable: you don't need multiple objects to represent the same number.
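The idea can be modelled in a few lines of Python. This is a toy sketch, not CPython's C code: `IntObject`, `make_int`, and `SMALL_INTS` are hypothetical names, and only the two constants mirror the macros mentioned above (with the values they had in CPython at the time, 5 and 257).

```python
NSMALLNEGINTS = 5     # the cache starts at -5
NSMALLPOSINTS = 257   # the cache covers 0..256


class IntObject:
    """Stand-in for an integer object; treated as immutable in this model."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


# Pre-allocate one object per small value, once, at start-up.
SMALL_INTS = tuple(IntObject(v) for v in range(-NSMALLNEGINTS, NSMALLPOSINTS))


def make_int(value):
    """Return the shared object for small values, otherwise allocate a new one."""
    if -NSMALLNEGINTS <= value < NSMALLPOSINTS:
        return SMALL_INTS[value + NSMALLNEGINTS]  # index 0 holds -5
    return IntObject(value)


print(make_int(1) is make_int(1))      # True: both calls return the pre-allocated object
print(make_int(257) is make_int(257))  # False: each call allocates a fresh object
```

No search is involved in this model: the value itself is the index into the array, which is how the existing object is found.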

score 0 · Answer 3 · answered Apr 19 '19 at 03:02

Python doesn't know anything until you tell it. So in your code above, when you initialize a and b, you are storing those values (in a register or in RAM) and naming the places where they are stored a and b, so that you can reference them later. If you didn't initialize a variable first, Python would just give you an error.

Andrew R.

I think you're missing the point of the question. `a == b` is obviously true. OP is asking why `a is b` is true. – Mad Physicist Apr 19 '19 at 03:03
