
I was curious whether I could see the memory addresses of lists, arrays and strings in Python, and came across something interesting and weird. Can someone explain what is going on?

>>> l = [1,1,2,2,3,3]
>>> for i in range(6):
...     adr = str( id(l[i]) )
...     print(f'{l[i]}: {adr[-6:]}')
... 
1: 422400
1: 422400
2: 422432
2: 422432
3: 422464
3: 422464
>>> 

You can clearly see that elements with the same value sit at the same memory address, going by the id documentation:

CPython implementation detail: This is the address of the object in memory.

This happens with strings too. The second weird thing happened with NumPy arrays, and I don't know if it also happens with lists and strings.

>>> arr
array([1, 2, 3, 4])
>>> id(arr)
140318415946496
>>> id(arr[0])
140318415101680
>>> id(arr[1])
140318415101680
>>> id(arr)
140318415946496
>>> id(arr[0])
140318415101904
>>> id(arr[1])
140318415101904
>>> id(arr)
140318415946496
>>> id(arr[0])
140318415101680
>>> id(arr[1])
140318415101680
>>> 

Whenever I call id to get the address of arr, the address of arr[0] changes.
I am running this on Python 3.8.0.

  • What should be weird is the lists, not the array. Anyway, the behavior you are seeing with small integers and (likely) small strings is an implementation detail that arises from internal optimizations. You should be surprised by that behavior. NumPy creates a new object whenever you index into an array, since numpy.ndarray objects wrap primitive, numeric arrays – juanpa.arrivillaga Dec 08 '19 at 02:59

1 Answer


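It gets better: if you call id(arr[0]) several times, it can give a different result each time. Here is why: while regular lists hold references to (Python) objects, NumPy arrays store their data internally as C arrays. So whenever you query an item's identity, a new Python object has to be created first. That temporary object is discarded as soon as the call returns, so its address can be reused by the next one, which is why id(arr[0]) and id(arr[1]) can even come out equal.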
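A minimal sketch of that difference. It uses the standard-library array module as a stand-in for NumPy, which is an assumption made only so the example runs without third-party packages. array.array likewise stores raw C values and builds a Python object on each access. The variable names are illustrative, and the identity results describe CPython, not a language guarantee.

```python
from array import array

# A regular list stores references to existing Python objects.
big = 10**6                       # well outside CPython's small-integer cache
items = [big, big]
list_same = items[0] is items[1]  # both slots reference the one object

# array.array stores raw C integers, so each index access has to
# build a fresh Python int from the stored number.
packed = array("q", [big, big])
first = packed[0]
second = packed[0]                # same slot, read a second time
array_same = first is second      # two distinct objects on CPython
values_equal = first == second    # the values still compare equal

print(list_same, array_same, values_equal)
```

On CPython this prints True False True: the list hands back the stored object, while the packed array produces a new object per access even for the same slot.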

In general, the identity of integers is not meaningful, as explained in "is" operator behaves unexpectedly with integers. For example:

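a, b = 256, 256
# `is` = operator that compares object identity
a is b  # True on CPython: 256 is in the small-integer cache
a = 257
b = 257
a is b  # False when typed line by line in the CPython REPL; may be True in a script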

The behavior of is/id is meaningless in this case, as it depends on implementation details, compiler optimisations (the results can differ between running code in the interactive interpreter and scripts), etc. Here is a discussion of the numpy-related case: Comparing object ids of two numpy arrays.
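To take the compiler out of the picture, here is a small sketch that builds the integers at runtime from strings, so no constants can be merged. The helper name same_object is hypothetical, and the results reflect CPython's small-integer cache, not anything the language promises.

```python
def same_object(text):
    """Parse `text` twice at runtime and report whether both results are one object."""
    first = int(text)
    second = int(text)
    return first is second

cached = same_object("256")    # inside CPython's small-integer cache
uncached = same_object("257")  # outside it, so two separate objects
print(cached, uncached)
```

On CPython this prints True False. In both cases the two values compare equal with ==, which is the comparison that actually matters for numbers.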

Bottom line, this stuff is fun, but not helpful for the programmer, as the identity and the is operator shouldn't be used like that. The only two relevant cases for id/is are

  • to test if something is None, or
  • to check if mutating x will also mutate y, because both reference the same object.

(https://stackoverflow.com/a/25758019/9907994)
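The two legitimate uses listed above can be sketched as follows. The function and variable names are made up for illustration. Unlike the integer examples, these results are guaranteed by the language, not by an implementation detail.

```python
def append_item(item, bucket=None):
    # `is None` checks identity against the None singleton.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

x = [1, 2, 3]
y = x          # alias: a second name for the same list object
z = list(x)    # shallow copy: equal contents, different object

alias = y is x       # True, so mutating x also mutates y
copy = z is x        # False, so z is unaffected by changes to x

x.append(4)
print(alias, copy, y, z, append_item("a"))
```

Here y shows the appended 4 because it references the same list as x, while z keeps its original three elements.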

sammy
  • So the explanation for elements with the same value having the same id is the same as for this a and b example? – Gabriel Borges Dec 08 '19 at 17:39
  • You mean your first example, the list `l = [1,1,2,2,3,3]`? Yes, because the current CPython implementation keeps an array of integer objects for all integers between `-5` and `256` (https://docs.python.org/3/c-api/long.html). Starting with `257`, new objects are generated. Concerning NumPy: a NumPy array is internally stored as a C array. Each entry in the array is just a number. Whenever you access an array item, NumPy creates a Python object from that number. The reason is that NumPy is optimised for efficient storage and vectorised operations on the array, not for individual element access. – sammy Dec 08 '19 at 20:02