I would like to ensure that several numpy arrays I'm allocating are properly freed.
I'm curious if there is any module that will let me track an object and print a message whenever its memory is deallocated.
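One thing I've been experimenting with (not sure it's the right tool): numpy arrays support weak references, so `weakref.finalize` can attach a callback that fires when the array object is collected. A minimal sketch:

```python
import weakref

import numpy as np

# Attach a finalizer to the array object; it runs when the object is
# garbage-collected (in CPython, as soon as the last reference drops).
a = np.ones((1000, 1000))          # ~8 MB array
freed = []
weakref.finalize(a, freed.append, "a was deallocated")

del a                              # last reference gone -> finalizer fires
print(freed)                       # ['a was deallocated']
```

One caveat I'm aware of: a view (e.g. a slice of `a`) holds a reference to the base array, so the finalizer won't fire until all views are gone too.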
For context: when I reach a certain point in my program, the system monitor shows Python using about 300 MB. Then I execute two commands: the first creates a list of numpy arrays totaling about 1 GB; the next performs vstack on this list, which increases memory by another ~1 GB. Then I take this big numpy array, do some math, and get an answer of about 1 MB (an 8000 x 128 uint8 ndarray). The other arrays are no longer needed, so they should be deallocated at the end of the function. However, after I return from the function and collect garbage, Python is still using 1 GB of memory. Where did those 700 MB come from!?
I'll further illustrate the example with pseudo-code:
import numpy as np

def myfunc(api):
    # Gets a list of 1000 1 MB arrays
    list_ = api.get_arrays()
    # Stacks them into a single ~1 GB array
    bigarray = np.vstack(list_)
    # Summarizes the big array using about 1 MB
    smallarray = api.cluster(bigarray)
    # I shouldn't really need these del statements
    del list_
    del bigarray
    return smallarray

def main():
    # I do preprocessing stuff and have about 300 MB in memory
    # It costs about 2 GB to run this function, but that should all be freed at the end
    smallarray = myfunc(api)
    # There is 700 MB of extra memory allocated! Where did it come from!?
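My working theory for part of this: the C-level allocator often keeps freed memory inside the process instead of returning it to the OS, so the system monitor's number behaves more like a high-water mark than a count of live data. A small sketch of that effect on Unix (the `resource` module is not available on Windows; `ru_maxrss` is a peak and never decreases):

```python
import resource

import numpy as np

def peak_rss():
    # Peak resident set size so far: kilobytes on Linux, bytes on macOS
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss()
big = np.ones((2000, 2000))    # ~32 MB; np.ones writes every page
after_alloc = peak_rss()

del big                        # numpy frees the buffer back to the allocator...
after_free = peak_rss()        # ...but peak RSS never goes back down

print(before, after_alloc, after_free)
```

If this is what's happening, the arrays really are freed from Python's point of view and the monitor is just showing retained heap.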
To debug this, I was thinking it would be useful to verify that those numpy arrays are actually gone. Maybe someone has a better idea, but hopefully someone at least has one.
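One way I've tried to check the Python side is `tracemalloc` (numpy allocations have been traceable by it since numpy 1.13). This is a scaled-down stand-in for the function above, with made-up sizes (10 arrays of ~1 MB instead of 1000): if the intermediates are really freed, traced current memory after the call should be far below the traced peak.

```python
import gc
import tracemalloc

import numpy as np

tracemalloc.start()

def myfunc():
    # Scaled-down stand-in: 10 arrays of ~1 MB each (512*256 float64)
    list_ = [np.ones((512, 256)) for _ in range(10)]
    bigarray = np.vstack(list_)          # another ~10 MB
    smallarray = bigarray[:8].copy()     # tiny "summary" of the big array
    return smallarray

small = myfunc()
gc.collect()
current, peak = tracemalloc.get_traced_memory()
print(f"current={current}  peak={peak}")  # current should be tiny next to peak
```

If `current` stays near `peak` instead, something is still holding references to the intermediate arrays.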