I recently learned that when a list is deleted in CPython, the freed list object is kept in an internal array and handed back out the next time a new list is created.

I ran this in my regular interpreter:

```python
l = [1, 2, 3]
l_id = id(l)
del l
g = [1, 2, 3]
id(g) == l_id  # True
```

And as expected I got the right result.

I tried the same thing in IPython and got False instead. Why does this happen? Is one behaviour better than the other?

Python version: v3.7.0:1bf9cc5093

IPython version: 7.5.0

Update

It also happens when the two lists have different contents:

```python
l = [1, 2, 3]
l_id = id(l)
del l
g = [1, 2, 3, 4, 5, 6, 7, 8]
id(g) == l_id  # True
```

And it happens every time; it is not just a coincidence that the two lists get the same id.

Update 2

I do know why this happens. I just want to know why it happens only in the plain Python interpreter and not in IPython, and which of the two behaviours is better for memory management.

Update 3

While I can explain why those lists have the same id, I cannot work out why IPython and Python differ.

Look at the implementation of list in listobject.c.

As we can see, there is an array of references called free_list, whose entries are destroyed list objects, and a counter numfree used to index into it. If 80 deleted lists are already cached, the next deleted one is not saved in the array. From those lines, my statement should always hold in any freshly started Python interpreter.
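The caching described above can be pictured with a toy model. This is only a sketch written in Python, not CPython's actual C code: the class `ToyListAllocator`, its fake addresses, and the variable names are made up for illustration, and only the cap of 80 comes from the text above.

```python
MAX_FREE = 80  # cap on cached list objects, as described above


class ToyListAllocator:
    """Toy model of a free list: released slots are reused last-in, first-out."""

    def __init__(self):
        self._free = []              # stands in for free_list / numfree
        self._next_address = 0x1000  # fake addresses, purely illustrative

    def allocate(self):
        # Reuse the most recently released slot if one is cached.
        if self._free:
            return self._free.pop()
        address = self._next_address
        self._next_address += 0x40
        return address

    def release(self, address):
        # Cache the slot unless the free list is already full.
        if len(self._free) < MAX_FREE:
            self._free.append(address)


alloc = ToyListAllocator()

# Delete a list, then create a new one: the same slot comes back.
first = alloc.allocate()
alloc.release(first)
second = alloc.allocate()
reused = first == second

# If something else allocates a list in between, it takes the cached slot
# and the next list ends up somewhere else.
alloc.release(second)
intruder = alloc.allocate()
third = alloc.allocate()
displaced = third != second

print(reused, displaced)
```

In this model the result depends entirely on whether any other list is created between the release and the next allocation.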

But I still cannot find a reason for IPython to behave differently.

Reznik
  • 7
    The `id()` of a deleted object is meaningless. The same id may, or may not, be assigned to a future object. – jasonharper Sep 24 '19 at 20:28
  • 1
    In addition, `g` and `l` having the same contents is irrelevant, so that's a bit of a red herring. – ggorlen Sep 24 '19 at 20:36
  • @jasonharper it does happen all the time. I would not call it random: I can repeat this process in a loop and always get the same id – Reznik Sep 24 '19 at 20:46
  • @wjandrea I am looking for the difference between how CPython and IPython manage memory, and why it happens only with lists (as far as I know) – Reznik Sep 24 '19 at 20:47
  • 4
    "Happens all the time" is one of the infinite possibilities covered by "may or may not happen". Among the Python implementations where this is at all *likely* to happen, I suspect the difference you're seeing is that IPython is executing more internal Python code after each interactive statement, and that some internally-created object happens to be the one that gets the same id as the deleted object. – jasonharper Sep 24 '19 at 20:51
  • 1
    FWIW, I can reproduce the same behaviour with a dict, though in different versions of CPython and IPython. – wjandrea Sep 24 '19 at 20:52
  • 1
    @jasonharper That was my initial guess too, but I disproved it by executing the entire 5 lines in block mode. – wim Sep 24 '19 at 20:54
  • Anyway, ultimately you're asking about undefined behaviour, so it shouldn't be surprising that you get different results in different implementations. – wjandrea Sep 24 '19 at 20:56
  • [id](https://docs.python.org/3.5/library/functions.html#id): Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. **Two objects with non-overlapping lifetimes may have the same id() value.** Emphasis mine. – solarc Sep 24 '19 at 21:02
  • @wjandrea as I can see here: [link](https://github.com/python/cpython/blob/3c87a667bb367ace1de6bd1577fdb4f66947da52/Objects/listobject.c) there is a counter for those lists, the id of a freed list is reused, and at most 80 freed lists are cached – Reznik Sep 24 '19 at 21:06
  • 1
    `ipython` with its `IN`, `OUT` and '___' stacks might start with a larger 'working memory'. But you are working in the wrong language if you feel a need to micro manage memory use, especially for base objects like lists and dicts. For large `numpy` arrays, that can occupy MB of contiguous memory blocks, you sometimes need to pay attention to memory use and the number of copies. But lists are dispersed objects. – hpaulj Sep 24 '19 at 21:07
  • @solarc that is true, but why does it happen with lists (and other objects) in pure Python and not in IPython? – Reznik Sep 24 '19 at 21:08
  • @hpaulj that may be true, but does IPython really modify the CPython code? I am asking this question just for general knowledge, by the way. – Reznik Sep 24 '19 at 21:11
  • Try `for _ in range(10): print(id([1,2,3]))`. How many repeats do you get? – hpaulj Sep 24 '19 at 21:11
  • It's not a matter of modifying code. Memory, and `id` allocation, is a runtime process. If you run the same script several times, in either environment, I doubt you'll get the same `id`. – hpaulj Sep 24 '19 at 21:14
  • @hpaulj your code returns the same id repeatedly, but the results differ between Python and IPython. Isn't it weird that IPython changes how ids are assigned to objects? Are you saying IPython throws away the caching of ids? – Reznik Sep 24 '19 at 21:18
  • IPython is not distinct from CPython. CPython is an implementation of the Python language in C, indeed, it is the reference implementation. IPython is an enhanced interactive REPL that uses CPython. – juanpa.arrivillaga Sep 24 '19 at 21:41
  • 1
    "And as expected I got the right result." *Why would you expect this*??? This is the result of various optimizations, and, critically, those optimizations are being masked because the IPython REPL keeps various references to objects around in a more enhanced "history". What you are seeing there should be *surprising*, not expected – juanpa.arrivillaga Sep 24 '19 at 21:43
  • Interestingly, if you put it in a script file and run it in IPython you see again the same memory re-use. So it is something about the IPython interactive REPL specifically, creating extra objects behind the scenes for whatever IPython feature. – wim Sep 24 '19 at 21:51
  • 1
    @wim yes, because when you `del` that name in the IPython repl, there are still many references kept by the repl, i.e. in the `IN` and `OUT` maps, and I believe there are several magic underscore variables like `_` and `__` etc. So it's not so much creating extra objects; rather, it is keeping the same object around much longer. This is commonly a problem encountered when people use it for interactive data-analysis with, say, pandas. – juanpa.arrivillaga Sep 24 '19 at 21:52
  • `_` and `__` are not stored for del or assignment statements. – wim Sep 24 '19 at 21:53
  • @wim true, I'm not sure then that it's because of the extra references. – juanpa.arrivillaga Sep 24 '19 at 22:00
  • 1
    @Reznik Something in IPython probably still holds a reference to the value of the variable for use with its %magic or history commands. The question then would be: do you depend on this functionality, and if so, why? – solarc Sep 24 '19 at 22:02
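The documentation quote and the loop suggested in the comments above can be combined into one small check. This is a sketch, not a statement about what any interpreter must print: the variable names are made up for illustration, and only the second result is guaranteed by the language.

```python
# Temporaries: each list dies before the next one is created,
# so the same id may (or may not) be handed out again.
temp_ids = [id([1, 2, 3]) for _ in range(10)]
distinct_temp = len(set(temp_ids))  # anywhere from 1 to 10 is allowed

# Live objects: all ten lists exist at the same time,
# so their ids are required to be different.
live = [[1, 2, 3] for _ in range(10)]
distinct_live = len({id(x) for x in live})

print(distinct_temp, distinct_live)
```

The first number is an implementation detail and can differ between environments; the second is always 10.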

1 Answer

First, id() has nothing to do with memory management in Python the language. However, in CPython (and hence in the IPython REPL, which runs on CPython) the id of an object is its memory address. Some comments pointed out that the question doesn't necessarily make sense in the abstract, but restricted to IPython and the standard CPython REPL it seems applicable.

All that's happening is that in the processing of your cell block the IPython environment creates a few extra objects (including list objects) behind the scenes. Since the original memory space from l is taken for some of the lists that IPython created behind the scenes, the CPython allocator finds a new block of memory for g.
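The same effect can be imitated in the plain CPython REPL or in a script by creating one extra list between the `del` and the new assignment. This is only a sketch: the name `intruder` is made up for illustration and stands in for whatever list IPython happens to create internally, and which address gets reused is a CPython implementation detail that may vary between versions.

```python
l = [1, 2, 3]
l_id = id(l)
del l

# Hypothetical stand-in for a list created behind the scenes.
intruder = [0]

g = [1, 2, 3]

# Often True on CPython, but nothing guarantees it.
intruder_took_slot = id(intruder) == l_id

# Guaranteed by the language: two live objects never share an id.
g_differs_from_intruder = id(g) != id(intruder)

print(intruder_took_slot, g_differs_from_intruder)
```

When the intruder does take the freed slot, `g` necessarily ends up at a different address, which matches the False result seen in IPython.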

For some evidence of the extra objects, consider the following experiment run in both CPython and IPython that introspects the garbage collector.

```python
from gc import get_objects

orig = set(map(id, get_objects()))
l = [1, 2, 3]
l_id = id(l)
del l
g = [1, 2, 3]
final = set(map(id, get_objects()))
len(orig.symmetric_difference(final))  # 2 in CPython, 6-40+ in IPython
```
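To see what kinds of objects appear, rather than only how many, the experiment can be extended to count new objects by type. This is a variant sketch, not the exact method above: the helper `new_tracked_objects` and the names `kept` and `make_list` are made up for illustration, and unlike the snippet above it deliberately keeps the first snapshot alive so that no pre-existing id can be recycled during the measurement.

```python
import gc
from collections import Counter


def new_tracked_objects(action):
    """Return gc-tracked objects that exist after action() but not before."""
    # Holding the snapshot keeps every pre-existing object alive, so none of
    # their ids can be reused while action() runs.
    snapshot = gc.get_objects()
    before = {id(obj) for obj in snapshot}
    action()
    return [
        obj for obj in gc.get_objects()
        if id(obj) not in before and obj is not snapshot and obj is not before
    ]


kept = []  # keeps the new list alive so that it shows up in the result


def make_list():
    kept.append([1, 2, 3])


new_objects = new_tracked_objects(make_list)
type_counts = Counter(type(obj).__name__ for obj in new_objects)
print(type_counts.most_common())
```

Run in both environments, the printed counts give a rough picture of what else was created; a few bookkeeping objects from the measurement itself may also appear.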
Hans Musgrave
  • 2
    CPython and IPython **are not different implementations**. CPython is a Python implementation, IPython is an enhanced interactive REPL that uses CPython. – juanpa.arrivillaga Sep 24 '19 at 21:44
  • "If there isn't enough contiguous space where l was originally placed then the CPython interpreter will choose a different location for the new list g " Probably not going to happen. A `list` object is essentially your generic PyObject header and a pointer to some primitive PyObject array along with some metadata. – juanpa.arrivillaga Sep 24 '19 at 21:45
  • 1
    Note, the standard interactive REPL that comes with the standard python distribution keeps a single reference to the last object evaluated, available in `_`. IPython, however, keeps a much richer history of evaluated expressions; this is what is going on here. – juanpa.arrivillaga Sep 24 '19 at 21:50
  • Thanks for the answer, but it doesn't answer my question: why does this code return different values in CPython and IPython? – Reznik Sep 24 '19 at 21:53
  • @Reznik it does answer the question, although, the issue may be that it is keeping references to the objects you created which you aren't addressing. In any case, IPython *does not have a different memory management system*, *it uses CPython* – juanpa.arrivillaga Sep 24 '19 at 21:58
  • The extra objects are from the IPython history manager. But even when you disable the history manager, the difference in memory allocating behavior persists. – wim Sep 24 '19 at 22:01
  • @juanpa.arrivillaga That's only partially accurate. Using CPython is not mutually exclusive with creating additional objects, and you'll be hard-pressed to find things like `IPython.core.builtin_trap.BuiltinTrap` bound methods in the standard CPython REPL, but all kinds of extraneous IPython objects appear when the same code is run in that environment. – Hans Musgrave Sep 24 '19 at 22:01
  • @HansMusgrave yea, you're right. IPython is definitely doing all sorts of things that could easily create a few list objects here and there. – juanpa.arrivillaga Sep 24 '19 at 22:02
  • @juanpa.arrivillaga I'm still a little uncertain as to the exact mechanism. None of the extra objects have the same id as the original list. – Hans Musgrave Sep 24 '19 at 22:08
  • 3
    @HansMusgrave well, Python uses a privately managed heap that optimizes things like the allocation of list objects, and the extra objects may simply be affecting how this manages things. In any case, these are all implementation details. – juanpa.arrivillaga Sep 24 '19 at 22:09
  • 1
    They don't need to have the exact same id. They can have other id but their size overlaps (use e.g. `sys.getsizeof` for hints). – wim Sep 24 '19 at 22:09
  • Although I could not find the smoking gun exactly, despite turning off as many IPython features as I could find, this is very probably the correct reason. Have an upvote. – wim Sep 25 '19 at 05:06
  • I checked it, and it seems you are right: if you check `orig` in your code, you will see that it contains the id of `g`. Thank you – Reznik Sep 25 '19 at 17:47