Different lists in Python in size and content still share the id, does memory matter?

Question

I have read the answer to this question as well as the related questions about different objects sharing the same id (which is explained by the Python docs on id). However, in those questions the contents of the objects are the same (and thus the memory sizes are the same, too). I experimented with lists of different sizes and contents in both the IPython shell and a .py file with CPython, and got the "same id" result, too:

print(id([1]), id([1,2,3]), id([1,2,3,4,1,1,1,1,1,1,1,1,1,1,1,1]))
# Result: 2067494928320 2067494928320 2067494928320

The result doesn't change no matter how many elements I add to the list or how large or small the numbers are.

So I have a question here: when an id is given, does the list size have any effect on whether the id can be reused or not? I thought that it could because according to the docs above,

CPython implementation detail: This is the address of the object in memory.

and if the address does not have enough space for the list, then a new id should be given. But I'm quite surprised about the result above.

@12944qwerty question edited, I tried it both ways and the results are the same — Đào Minh Dũng, May 03 '21 at 17:10
None of your three lists *exist at the same time*. Whether or not they have the same ID is therefore utterly irrelevant. — jasonharper, May 03 '21 at 17:32
@jasonharper I see that the ID can still be reused, but what I want to ask here is that if the ID represent address in memory, there could be the case that the memory from that address is not enough for the newly created list, and it should be given new location ID. But I tried adding more and more, bigger and bigger numbers, nothing changed — Đào Minh Dũng, May 03 '21 at 17:38
The size of the list object itself (the thing that the ID is the address of) is constant. The actual contents of the list are in a separate block of memory, that you can't easily get the address of. So the sizes of your lists do not affect the possibility of an ID being reused. — jasonharper, May 03 '21 at 17:43
@jasonharper Thanks a lot, I guess I understand it now. If I'm correct, it is like the dynamically allocated array in heap memory in the C/C++ language, as it does not guarantee that the elements are contiguous? — Đào Minh Dũng, May 03 '21 at 17:48
@12944qwerty IPython *is CPython*. It is a REPL, not an alternate python implementation — juanpa.arrivillaga, May 03 '21 at 21:21

score 1 · Accepted Answer · answered May 03 '21 at 19:48

Make a list and add some items to it. The id remains the same:

In [21]: alist = []
In [22]: id(alist)
Out[22]: 139981602327808
In [23]: for i in range(29): alist.append(i)
In [24]: id(alist)
Out[24]: 139981602327808

But the memory use for this list occurs in several parts. There's some sort of storage for the list instance itself (that's what the id references). CPython is written in C, but all items are objects (as in C++).
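A minimal sketch of why the id in the question tracks the list object rather than its contents. The variable names are illustrative, and address reuse for temporaries is CPython behavior, not a guarantee:

```python
# Lists that are alive at the same time must have distinct ids,
# whatever their sizes or contents.
kept = [[1], [1, 2, 3], list(range(16))]
ids_alive = [id(lst) for lst in kept]
print(len(set(ids_alive)))  # three live objects, three different ids

# Each temporary list below is freed as soon as id() returns, so
# CPython may (but need not) give the next list the same address.
ids_temp = (id([1]), id([1, 2, 3]), id(list(range(16))))
print(ids_temp)
```

The second print often shows one repeated number on CPython, because the freed list object is a fixed-size block that the next list can occupy.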

The list also has a data buffer; think of it as a C array with a fixed size. It holds pointers to objects elsewhere in memory. That buffer has space for the current references plus some growth space. As you add items to the list, their references are inserted in the growth space. When that fills up, the list gets a new buffer with more growth space. List append is relatively fast, with periodic slowdowns as it copies references to the new buffer. All of that occurs under the covers, so the Python programmer doesn't notice.

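I suspect that in my example alist got a new buffer several times. Python doesn't expose the buffer's address, but sys.getsizeof(alist) includes the buffer's allocated size, so its value changes whenever the buffer is resized.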
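Buffer growth can be observed indirectly. The sketch below assumes CPython, where sys.getsizeof on a list counts the list object plus its pointer buffer (not the elements); the exact sizes vary by version and platform:

```python
import sys

alist = []
before = id(alist)
sizes = [sys.getsizeof(alist)]  # distinct sizes seen so far

for i in range(29):
    alist.append(i)
    size = sys.getsizeof(alist)
    if size != sizes[-1]:
        sizes.append(size)  # the buffer was resized on this append

print(id(alist) == before)  # the list object itself never moved
print(sizes)                # one entry per distinct allocation size
```

The id stays put while the reported size jumps in steps, which matches the picture of a fixed list object pointing at a buffer that is occasionally enlarged.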

Storage for the objects referenced by the list is another matter. CPython creates small integer objects (-5 through 256) at startup, so my list (and yours) will hold references to those shared integer objects. It also maintains a cache of some strings (interning). But other things, such as larger numbers, other lists, dicts, and custom class objects, are created as needed. Identically valued objects might well have different ids.
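A short sketch of that difference. The values are built at runtime with int(...) to avoid compile-time constant sharing; the small-integer cache is a CPython detail, so the first result may differ on other implementations:

```python
# Small integers: CPython hands out the same cached object.
a = int("100")
b = int("100")
print(a is b)

# Larger integers: equal values, but separately created objects.
big1 = int("10") ** 20
big2 = int("10") ** 20
print(big1 == big2, big1 is big2)

# Two live lists with equal contents are always distinct objects.
list_a = [1, 2, 3]
list_b = [1, 2, 3]
print(list_a == list_b, list_a is list_b)
```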

So while the data buffer of the list is contiguous, the items referenced in the buffer are not.

By and large, that memory management is unimportant to programmers. It's only when data structures get very large that we need to worry. And that seems to occur more with object classes like numpy arrays, which have a very different memory use.
