I have a hypothetical question regarding the memory usage of lists in Python. I have a long list my_list that consumes multiple gigabytes when it is loaded into memory. I want to loop over that list and use each element only once during the iteration, meaning I could delete each element from the list after processing it. While I am looping, I am storing something else in memory, so the memory I allocated for my_list is needed for something else. Ideally, I would like to delete the list elements and free their memory while I am looping over them.
I assume that, in most cases, a generator would make the most sense here. I could dump my list to a CSV file and then read it line by line in a for-loop. In that case, my_list would never be loaded into memory in the first place. However, let's assume for the sake of discussion that I don't want to do that.
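For reference, the generator approach mentioned above could look roughly like the following. This is only a minimal sketch: the helper name iter_rows and the temporary demo file are assumptions made for illustration, not part of my actual setup.

```python
import csv
import os
import tempfile


def iter_rows(path):
    """Yield one parsed CSV row at a time, so the file is never fully in memory."""
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            yield row


# Write a tiny demo file (hypothetical data) so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    csv.writer(tmp).writerows([[1, "a"], [2, "b"], [3, "c"]])
    demo_path = tmp.name

total = 0
for row in iter_rows(demo_path):
    # Only the current row is held in memory here.
    total += int(row[0])

os.remove(demo_path)
print(total)  # 6
```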
Is there a way of releasing the memory of a list as I loop over it? The following does NOT work:
>>> import sys
>>> my_list = [1, 2, 3]
>>> sys.getsizeof(my_list)
80
>>> my_list.pop()
3
>>> sys.getsizeof(my_list)
80
or
>>> my_list = [1,2,3]
>>> sys.getsizeof(my_list)
80
>>> del my_list[-1]
>>> sys.getsizeof(my_list)
80
even when gc.collect() is called explicitly.
The only approach I got to work is copying the list (which at the time of copying requires 2x the memory and thus is again a problem):
>>> my_list = [1,2,3]
>>> sys.getsizeof(my_list)
80
>>> my_list.pop()
3
>>> my_list_copy = my_list.copy()
>>> sys.getsizeof(my_list_copy)
72
I could not find information on this topic, which suggests to me that the approach is either impossible or bad practice. If it should not be done this way, what would be the best alternative? Loading from a CSV file with a generator? Or are there even better ways of doing this?
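To make the intended access pattern concrete, here is a minimal sketch of consuming a list destructively. The helper name consume is hypothetical. It pops from the end because that is cheap for a list; if the original order matters, the list could first be reversed in place with my_list.reverse(), which does not create a copy. As far as I understand CPython, each popped element can be freed once its last reference is gone, while the list's own pointer buffer only shrinks in steps.

```python
def consume(items):
    """Yield elements from the end of `items`, removing each one as it is yielded."""
    while items:
        # pop() drops the list's reference, so the element can be freed
        # as soon as the caller is done with it.
        yield items.pop()


data = list(range(10))
seen = []
for element in consume(data):
    seen.append(element)  # stand-in for the real per-element work

print(len(data))  # 0
```

Note that this visits the elements in reverse order and leaves the list empty afterwards.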
EDIT: as @Scott Hunter pointed out, the memory does get released for much larger lists:
>>> import gc
>>> import sys
>>> my_list = [1] * 10**9
>>> for i in range(10):
...     for j in range(10**8):
...         del my_list[-1]
...     gc.collect()
...     print(sys.getsizeof(my_list))
Prints the following:
8000000056
8000000056
8000000056
8000000056
8000000056
4500000088
4500000088
2531250112
1423828240
56
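The same step-wise pattern can be reproduced at a smaller scale. The sketch below assumes CPython, where sys.getsizeof reports the size of the list's allocated pointer buffer (not of the elements themselves) and where that buffer appears to be reallocated only once the length drops below half of the allocated capacity. The exact byte counts depend on the Python version and platform, so none are shown.

```python
import sys

my_list = [1] * 1_000_000

# Record the reported size each time it changes while the list is emptied.
sizes = [sys.getsizeof(my_list)]
while my_list:
    my_list.pop()
    size = sys.getsizeof(my_list)
    if size != sizes[-1]:
        sizes.append(size)

# A million pops produce only a handful of distinct sizes,
# i.e. the buffer shrinks in a few large steps rather than per element.
print(len(sizes), sizes[0], sizes[-1])
```

This would also explain why the three-element example shows no change: the list is too small for a single pop to cross the shrink threshold.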