
When I run the program below, memory usage increases very quickly, so I suppose that the memory used in the function named "secundary_function" isn't freed. The problem disappears if I copy the element that I append to the list, or if I don't call secundary_function. I'd like to understand why the copy is necessary here and why secundary_function has an influence on the memory used.

import numpy as np
import time

def main_function(N):
    liste_images = []

    for i in range(N):
        images = np.zeros((3000,25,25))
        time.sleep(0.05)
        secundary_function(images)
        liste_images.append(images[0])

def secundary_function(images):
    conservee = np.arange(len(images))
    images[conservee]

main_function(6000)

Thank you for your answers, and sorry for my English!

Yohan P.
  • It looks like you are generating a 3000x25x25 3D-array 6000 times. Is that your intent? – roadrunner66 Mar 30 '16 at 22:52
  • Also see this link for the freeing of memory: http://stackoverflow.com/questions/18310668/is-freeing-handled-differently-for-small-large-numpy-arrays – roadrunner66 Mar 30 '16 at 22:59
  • In the original program, images is a different 3D array each time, and there is a test which selects some of the 25x25 2D arrays that I want to keep in the list liste_images. But in my mind, memory should be freed each time I create a new 3D array "images". – Yohan P. Mar 30 '16 at 23:02
  • Your code has some problems. You need to understand French to know that `main_function` is the same as `fonction_principale`. And the last line of `secondary_function` makes no sense. You're indexing an array with another array? – Roland Smith Mar 30 '16 at 23:02
  • In Python, variables are really labels pointing to objects. Objects can only be removed from memory once there are no more references to them. Maybe your original program keeps some references around? – Roland Smith Mar 30 '16 at 23:07
  • Sorry, I translated the names of my functions but forgot to change the last line. – Yohan P. Mar 30 '16 at 23:09
  • @RolandSmith Yes, I want to index an array with another array: in the original program I need to apply a cascade of tests to my 3D array `images`. `conservee` is updated at each step of the cascade and contains the indices of the elements of `images` which have passed the previous tests, so `images[conservee]` is the 3D array which has to pass the next test. – Yohan P. Mar 30 '16 at 23:17
  • @RolandSmith, when I execute the code I've published here, the memory increases quickly, not only in my original program. – Yohan P. Mar 30 '16 at 23:21

1 Answer


In this line:

liste_images.append(images[0])

images[0] creates a view of the 3000x25x25 images array. This means that the result of images[0], which you append to liste_images, holds a reference to the entire 3000x25x25 array, so the big array cannot be garbage collected while the view is still in the list. When you make a copy, you create a new, independent 25x25 array, and the big array can be freed in the next iteration of the for loop.
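
As a small illustration of the view-versus-copy difference described above (a sketch only; the names `view` and `copy` are introduced here and are not part of the question's code), NumPy's `base` attribute and `np.shares_memory` can show which object keeps the big array alive:

```python
import numpy as np

images = np.zeros((3000, 25, 25))

view = images[0]          # basic indexing: a view, no data is copied
copy = images[0].copy()   # an independent 25x25 array with its own data

print(view.base is images)              # True: the view references `images`
print(copy.base is None)                # True: the copy owns its data
print(np.shares_memory(view, images))   # True
print(np.shares_memory(copy, images))   # False

# Bytes kept alive per appended element in each case (float64 = 8 bytes)
print(images.nbytes)   # 15000000, retained when the view is appended
print(copy.nbytes)     # 5000, retained when the copy is appended
```

With 6000 iterations, appending views keeps up to 6000 x 15 MB of arrays reachable, whereas appending copies keeps only 6000 x 5000 bytes, about 30 MB.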

  • But I don't understand why, if I don't use the function `secundary_function`, the problem disappears... – Yohan P. Mar 30 '16 at 23:44
  • @morningsun OK, thank you. I think I've understood: when I replace the line `images[conservee]` with `images*2`, the problem is the same. I just don't understand where the array is stored when I don't use it, if it is not in RAM. On the hard disk? That would be very slow, no? – Yohan P. Apr 01 '16 at 18:05
  • @Yohan.P - only the size of the array plus some other metadata is stored initially. The operating system keeps track of when the array is actually used, and only then does it allocate in memory the part of the array that is used. –  Apr 01 '16 at 18:45
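
To relate the last two comments to the code in the question, the sketch below (the names `selected` and `doubled` are introduced here for illustration) checks that both `images[conservee]` and `images*2` build a new full-size array rather than a view. It is only an assumption, inferred from the comment above, that reading every element of `images` in this way is what makes the operating system actually back the array with memory; the snippet does not measure that.

```python
import numpy as np

images = np.zeros((3000, 25, 25))
conservee = np.arange(len(images))

selected = images[conservee]   # integer-array ("fancy") indexing
doubled = images * 2           # element-wise arithmetic

# Both results are new full-size arrays, not views of `images`
print(np.shares_memory(selected, images))   # False
print(np.shares_memory(doubled, images))    # False
print(selected.nbytes == images.nbytes)     # True
print(doubled.nbytes == images.nbytes)      # True
```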