
Let's create a list-of-lists:

l = []
for i in range(1000000):
    l.append(['abcdefghijklmnopqrstuvwxyz', '1.8', '5', 'john@email.com', 'ffffffffff'])

`l.__sizeof__()` reports 8697440, and the process occupies 111 MB.

Now let's try a list-of-tuples instead of a list-of-lists:

l = []
for i in range(1000000):
    l.append(('abcdefghijklmnopqrstuvwxyz', '1.8', '5', 'john@email.com', 'ffffffffff'))

`l.__sizeof__()` reports 8697440, and the process occupies 12 MB.

As expected, a huge improvement. Now let's cast the list into a tuple just before insertion instead:

l = []
for i in range(1000000):
    l.append(tuple(['abcdefghijklmnopqrstuvwxyz', '1.8', '5', 'john@email.com', 'ffffffffff']))

`l.__sizeof__()` reports 8697440, and the process occupies 97 MB.

Why 97 MB, and not 12 MB?

Is it garbage? `gc.collect()` reports 0.

Doing a `del l` releases all the memory, down to 4 MB, which is just a little over what a blank interpreter occupies. So `l` really was bloated.


When we cast a list into a tuple, is the resulting tuple "impure" compared to a tuple created from scratch?


And if so, is there a workaround to convert a list into a "pure" tuple?

AneesAhmed777
  • Your second example is broken - you have `([` to open but `)` to close. Do you actually have a list there, or a tuple, or both? – Daniel Roseman May 10 '18 at 12:11
  • @DanielRoseman, sorry that was a typo. As the intention "list-of-tuples" suggests, there is no `[` in the whole line. EDITED. – AneesAhmed777 May 10 '18 at 12:18
  • @Kasramvd So i guess your link answers why lists are not getting 100% deallocated. **But is there a workaround for list to tuple conversion?** One method might be pickling to disk and loading back. – AneesAhmed777 May 10 '18 at 12:48
  • Here are some related links: https://stackoverflow.com/questions/50124621/understanding-memory-usage-in-python and https://stackoverflow.com/questions/46664007/why-do-tuples-take-less-space-in-memory-than-lists – Mazdak May 10 '18 at 12:58
  • @Kasramvd thank you for re-opening... and now someone posted a real answer too :-) – AneesAhmed777 May 10 '18 at 13:52

1 Answer


Short answer

No. Your line of reasoning is flawed because you are not realising that constant tuples benefit from constant folding.
(Hint: there is no way tuples could hold in 12 MB what lists are holding in 111 MB!)

Long answer

Your first example creates a million and one lists. Your third example creates one list and a million tuples. The slight memory difference is due to the fact that tuples are more compactly stored (they don't keep any extra capacity to handle appends, for example).
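The per-container difference is easy to see with `sys.getsizeof` (a small sketch; exact byte counts vary by CPython version and platform, but the tuple is consistently smaller because it has a leaner header and no spare capacity):

```python
import sys

# A 5-element list vs. an equivalent tuple: the tuple never
# over-allocates room for future appends, so it is smaller.
lst = ['abcdefghijklmnopqrstuvwxyz', '1.8', '5', 'john@email.com', 'ffffffffff']
tup = tuple(lst)

print(sys.getsizeof(lst), sys.getsizeof(tup))
assert sys.getsizeof(tup) < sys.getsizeof(lst)
```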

But in your second example, you have one list, and ONE tuple - it's a constant value, so it gets created once during compilation of the code, and references to that one object get reused when setting each element of the list. If it wasn't constant (for example, if you used i as one of the elements of the tuple), a new tuple would have to be created each time, and you'd get comparable memory usage to example #3.
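You can confirm this claim with an identity check (a CPython-specific sketch; the constant-sharing behaviour is an implementation detail, not a language guarantee):

```python
# The tuple literal of constants is compiled once into the code
# object, so every iteration appends a reference to the SAME object.
l = []
for i in range(5):
    l.append(('abcdefghijklmnopqrstuvwxyz', '1.8', '5', 'john@email.com', 'ffffffffff'))

assert all(t is l[0] for t in l)  # one tuple, five references

# tuple(...) is an ordinary call that builds a fresh tuple each time,
# so the resulting objects are all distinct.
l2 = []
for i in range(5):
    l2.append(tuple(['abcdefghijklmnopqrstuvwxyz', '1.8', '5', 'john@email.com', 'ffffffffff']))

assert not any(l2[0] is t for t in l2[1:])
```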

The third example could theoretically have the same memory usage as #2, but this would require that Python compare each newly-created tuple to all existing tuples to see if they happen to be identical. This would be slow, and of extremely limited benefit in typical real-world programs that don't create huge numbers of identical tuples.
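If you really do need that deduplication, one possible workaround (not part of this answer, just a hypothetical sketch) is to do it yourself with a dict cache, so equal tuples collapse to a single shared object:

```python
# Hypothetical deduplication cache: setdefault returns the
# previously-stored tuple whenever an equal one already exists.
cache = {}

def dedup(t):
    return cache.setdefault(t, t)

l = []
for i in range(1000000):
    l.append(dedup(tuple(['abcdefghijklmnopqrstuvwxyz', '1.8', '5', 'john@email.com', 'ffffffffff'])))

assert all(t is l[0] for t in l)  # all million entries share one tuple
```

The temporary tuples are still created and discarded each iteration, but the list ends up holding a million references to one object, matching the memory profile of example #2.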

jasonharper