
I have the following Python code:

import os, psutil
import numpy as np
process = psutil.Process(os.getpid())
print(process.memory_info().rss)

def append(x):
    x.append(np.random.normal(size=(1000,1000)))

a = []
append(a)
append(a)
append(a)
print(process.memory_info().rss)
a = [i[:10] for i in a]
print(process.memory_info().rss) # memory has not been reclaimed!!

Why doesn't Python use less memory after I replace every array in `a` with a smaller slice?

martineau
bumpbump
    For starters, `i[:10]` does **not** make the array smaller, it creates a new numpy array object which shares the same underlying buffer, i.e. it's a view. So the buffers are still around. – juanpa.arrivillaga Jan 30 '22 at 01:22

    I'm pretty sure Python does not return memory to the OS. It just marks its own heap as free for re-use. Memory will always be the maximum used. It may in the future be swapped out by the OS. – Keith Jan 30 '22 at 01:24

    @Keith it's more due to fragmentation, but if the underlying buffer is still there like juanpa said, it might be irrelevant – Bharel Jan 30 '22 at 01:25

1 Answer


After testing, it seems the underlying reason is exactly what @juanpa suspected.

Basic indexing in NumPy returns a view, not a copy, so the full underlying buffer stays alive for as long as the view does. You can reach the original array through a[0].base.
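A minimal sketch of how you could check this yourself. The names `big` and `view` are made up for illustration, and `np.zeros` stands in for the random data so the sizes are easy to read:

```python
import numpy as np

big = np.zeros((1000, 1000))   # float64, so 8,000,000 bytes of data
view = big[:10]                # basic slicing: a view, not a copy

print(view.base is big)              # the view keeps a reference to the original array
print(np.shares_memory(view, big))   # both arrays use the same buffer
print(view.flags.owndata)            # False: the view does not own its data
print(view.nbytes, big.nbytes)       # logical size of the view vs. the buffer it pins
```

Note that `nbytes` on the view only reports the 10 rows it exposes, which is why the slice looks small even though the whole buffer is still allocated.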

If, however, you copy each slice, for example with a = [i.copy() for i in a], you should see a drop in allocated memory: each copy owns its own small buffer, nothing references the original arrays any more, and their buffers are freed.
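As a rough illustration of why the copy helps, the sketch below compares the bytes pinned by the views with the bytes owned by the copies. It uses `np.zeros` instead of random data and reports logical array sizes (`nbytes`), not the process RSS; `pinned`, `owned` and `all_own` are hypothetical names:

```python
import numpy as np

a = [np.zeros((1000, 1000)) for _ in range(3)]

a = [i[:10] for i in a]                   # views: each one keeps its full parent buffer alive
pinned = sum(i.base.nbytes for i in a)    # bytes held alive through .base

a = [i.copy() for i in a]                 # copies: new small buffers, parents become unreferenced
owned = sum(i.nbytes for i in a)          # bytes actually owned by the copies
all_own = all(i.base is None and i.flags.owndata for i in a)

print(pinned, owned, all_own)
```

Under these assumptions the views pin 24,000,000 bytes in total, while the copies own 240,000.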

Note, however, that if you allocate other objects between the slicing and the copying, the freed memory might not be returned to the OS because of fragmentation.

Run this code and you'll see the difference:

import os, psutil
import numpy as np
process = psutil.Process(os.getpid())
print(process.memory_info().rss)

def append(x):
    x.append(np.random.normal(size=(1000,1000)))

a = []
append(a)
append(a)
append(a)
print(process.memory_info().rss)
a = [i[:10].copy() for i in a]   # Copy the array.
print(process.memory_info().rss)
Bharel