
I have a very large numpy array whose memory is never freed again after the array is deleted. Here is a minimal demonstration so you can reproduce the problem yourself.

Memory allocated to a simple numpy array is freed immediately once the variable is removed (below I delete it explicitly):

import numpy as np

X = np.ones((40000, 40000))

X.nbytes
12800000000

del(X)

When I run the code above, all of the roughly 12 GB of memory is freed immediately. But with nested numpy arrays, things get complicated:

import numpy as np
import random

foo = np.array([np.array([np.ones((256,)) for j in range(random.randint(100, 150))]) for i in range(40000)])

sum(f.nbytes for f in foo)
10240481280

del(foo)

Now the 10 GB of memory is never freed, even if you run gc.collect() explicitly. Does anyone have a clue why?

P.S. Environment: Ubuntu, Python 2.7, numpy 1.15.1
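
One way to tell a real leak from memory that the process simply keeps for reuse is to repeat the allocate-and-delete cycle and watch the peak resident set size. The sketch below is only an illustration under a few assumptions: `peak_rss_per_run` and `build_nested` are hypothetical helper names, the workload is scaled down to 2000 rows and uses a plain list as the outer container, and the `resource` module is available only on Unix.

```python
import gc
import resource  # Unix only


def peak_rss_per_run(make, runs=5):
    """Call make() `runs` times, drop the result each time, and record
    the process's peak RSS after every run."""
    peaks = []
    for _ in range(runs):
        obj = make()
        del obj
        gc.collect()
        # ru_maxrss is the peak so far: kilobytes on Linux, bytes on macOS.
        peaks.append(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    return peaks


def build_nested(rows=2000):
    """Scaled-down version of the nested allocation from the question."""
    # Imported here so the helper above can be reused without numpy.
    import random
    import numpy as np
    return [np.array([np.ones((256,)) for _ in range(random.randint(100, 150))])
            for _ in range(rows)]
```

Calling `peak_rss_per_run(build_nested)` returns one number per run. If the numbers level off after the first run, the freed memory is being reused by later allocations, which points away from a leak. If they grow on every run, something is still holding references.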

Mehraban
  • perhaps you can try to flatten the array before deleting it. – Binyamin Even Sep 04 '18 at 14:09
  • @BinyaminEven It may work. It may have many workarounds but I'm looking for the reason behind this weird behavior. – Mehraban Sep 04 '18 at 14:12
  • Have you spent any time poring over the Numpy documentation? Can we rule that out? – wwii Sep 04 '18 at 14:18
  • @wwii This is not my actual code, it's just a minimal example to reproduce the problem. – Mehraban Sep 04 '18 at 14:21
  • What system are you using? If I run `del(foo)` on my Win10 Anaconda Python 3.6, the memory frees up as it should. – JohanL Sep 04 '18 at 14:34
  • @JohanL Well, interesting. I'm on Ubuntu/Python 2.7. – Mehraban Sep 04 '18 at 14:35
  • OK, and what version number for numpy (`numpy.__version__`)? I am at 1.14.3. To me, this seems like an issue with the numpy implementation. – JohanL Sep 04 '18 at 14:38
  • @JohanL numpy version is 1.15.1 – Mehraban Sep 04 '18 at 14:39
  • Work inside functions, and return only what you need... The rest will be garbage collected when you exit the function. – Benjamin Sep 04 '18 at 14:39
  • @Benjamin It simply doesn't. Actually, my code is inside a function and I'm not really deleting the variable. I expected the memory to be freed after the call, but it isn't. – Mehraban Sep 04 '18 at 14:49
  • OK, good to know because: https://stackoverflow.com/questions/39255371/when-am-i-supposed-to-use-del-in-python/39255472 – Benjamin Sep 04 '18 at 14:59
  • I'm on Ubuntu 16.04 / Python 3.5 / numpy 1.13.1 and can confirm this same behavior. – user2699 Sep 04 '18 at 15:10
  • https://stackoverflow.com/a/27419153/2823755 – wwii Sep 04 '18 at 18:26
  • https://stackoverflow.com/questions/15455048/releasing-memory-in-python makes me suspect this isn't a numpy-specific issue. – user2699 Sep 04 '18 at 22:44
  • Run your problematic code in a loop: if the memory doesn't keep climbing, it's not a leak but merely failure to return memory to the OS (which is common and complicated). – Davis Herring Sep 05 '18 at 02:15
  • @DavisHerring Well, I tested it and the memory doesn't climb any further. It still seems like a severe problem, as it eats all available memory and then goes to swap ... – Mehraban Sep 05 '18 at 07:50
  • Possible duplicate of [Releasing memory in Python](https://stackoverflow.com/questions/15455048/releasing-memory-in-python) – Davis Herring Sep 05 '18 at 12:29
  • @Mehraban: That memory will eventually all go to swap and stay there, causing no further performance problems. But see the duplicate for a way around it. – Davis Herring Sep 05 '18 at 12:30
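
If the memory really has to go back to the OS, one commonly suggested workaround for this kind of situation is to do the heavy allocation in a short-lived child process, since everything a process holds is released when it exits. The sketch below is an assumption about how that could look, not necessarily what the linked question recommends: `run_in_child` and `WORKLOAD` are hypothetical names, and the workload is scaled down to 2000 rows.

```python
import subprocess
import sys


def run_in_child(code):
    """Run `code` in a fresh Python interpreter and return its stdout as text."""
    out = subprocess.check_output([sys.executable, "-c", code])
    return out.decode().strip()


# Hypothetical workload: build the nested arrays in the child and print only a
# small summary. The child's memory is returned to the OS when it exits.
WORKLOAD = """
import random
import numpy as np
foo = [np.array([np.ones((256,)) for j in range(random.randint(100, 150))])
       for i in range(2000)]
print(sum(f.nbytes for f in foo))
"""
```

With numpy installed, `run_in_child(WORKLOAD)` returns the total byte count as a string, and the parent process never holds the large arrays itself. The trade-off is that only small results can be passed back cheaply this way.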
