
Hi, I am seeing massive memory usage from my Python program.

I have simplified my code; here is the main function:

points = []
read_pcd(TEST1_PCD, points)
del points[:]  # I also tried del points
# program exits here

The problem is that my point-cloud data set is large, ~1 million points. I process it and convert it into an elevation map, so I no longer need the points... however, the memory allocated to the points remains. I have tried del points as well. As you can see in the memory profiler output, deleting the points frees only about 8 MiB. Does Python not bother to free the memory that the list elements occupied? I am worried about running out of memory later in my project.

This is the memory profiler I used: https://pypi.python.org/pypi/memory_profiler
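For reference, it is used by decorating the function of interest with @profile and running the script through the module; a minimal sketch (my_script.py is just a placeholder name):

from memory_profiler import profile

@profile
def main():
    points = []
    read_pcd(TEST1_PCD, points)
    del points[:]

# Run with:
#   python -m memory_profiler my_script.py
# which prints the per-line Mem usage / Increment tables shown below.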

Here is the read_pcd function for reference:

def read_pcd(fname, points):
    data_start = False
    with open(fname, 'r') as f:
        for line in f:
            words = line.split()
            if not words:          # skip blank lines
                continue
            if words[0] == "DATA":
                data_start = True  # the points follow the DATA header
            elif data_start:
                point = Point(float(words[0]), float(words[1]), float(words[2]))
                points.append(point)

    Line #    Mem usage    Increment   Line Contents
================================================
    17   23.559 MiB    0.000 MiB   @profile
    18                             def main():
    19   24.121 MiB    0.562 MiB       rospy.init_node('traversability_analysis_node')
    20   24.129 MiB    0.008 MiB       points = []
    21 1322.910 MiB 1298.781 MiB       read_pcd(TEST1_PCD, points)
    22 1315.004 MiB   -7.906 MiB       del points[:]



import math as m
import numpy as np

class Point(object):
    def __init__(self, x=0.0, y=0.0, z=0.0, intensity=255, cart=True,
                 range_m=0, az=0, elv=0):
        # http://www.mathworks.com.au/help/matlab/ref/cart2sph.html
        DEG2RAD = m.pi / 180.0
        if cart:
            # Cartesian input: store it and derive the spherical coordinates
            self.x = x
            self.y = y
            self.z = z
            self.intensity = intensity
            self.range = np.sqrt(x**2 + y**2 + z**2)
            self.azimuth = np.arctan2(y, x)
            r_xy = np.sqrt(x**2 + y**2)
            self.elevation = np.arctan2(z, r_xy)
        else:
            # Spherical input (in degrees): derive the Cartesian coordinates
            elv = elv * DEG2RAD
            az = az * DEG2RAD
            self.x = range_m * np.cos(elv) * np.cos(az)
            self.y = range_m * np.cos(elv) * np.sin(az)
            self.z = range_m * np.sin(elv)
            self.range = range_m
            self.azimuth = az
            self.elevation = elv
            self.intensity = intensity
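Each of the ~1 million points is a full Python object carrying its own attribute dictionary, so per-instance overhead dominates here. A minimal sketch of one common mitigation, __slots__, which drops the per-instance __dict__ (the attribute names simply mirror the class above):

class Point(object):
    # Declaring __slots__ stores these attributes in fixed slots instead
    # of a per-instance __dict__, which saves a large amount of memory
    # when millions of Point objects are alive at once.
    __slots__ = ('x', 'y', 'z', 'intensity', 'range', 'azimuth', 'elevation')

    # __init__ stays exactly as above.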

Profiler output when calling gc.collect():

    Line #    Mem usage    Increment   Line Contents
================================================
    18   23.555 MiB    0.000 MiB   @profile
    19                             def main():
    20   24.117 MiB    0.562 MiB       rospy.init_node('traversability_analysis_node')
    21   24.125 MiB    0.008 MiB       points = []
    22 1322.914 MiB 1298.789 MiB       read_pcd(TEST1_PCD, points)
    23 1322.914 MiB    0.000 MiB       gc.collect()
    24 1315.008 MiB   -7.906 MiB       del points
    25 1315.008 MiB    0.000 MiB       time.sleep(5)
  • Why do you do `del points[:]` instead of `del points`? – bgusach Feb 25 '14 at 12:18
  • Yeah, I tried both; same result. – SentinalBais Feb 25 '14 at 12:24
  • You already got the answers below. Now, do you really need to load all your data into memory? If not, turning your function into a generator and iterating over it would avoid the whole problem (see the sketch after these comments). – bruno desthuilliers Feb 25 '14 at 12:43
  • Yeah, I guess I will have to do that. I actually store the list in a NumPy array, and in order to initialise a NumPy array you need a list object. I thought it would be faster to gather all the points in a list and then initialise the NumPy array, rather than using the append function http://docs.scipy.org/doc/numpy/reference/generated/numpy.append.html, which I assumed would be slow because of array resizing. – SentinalBais Feb 25 '14 at 12:56
  • Here they recommend http://stackoverflow.com/questions/5064822/how-to-add-items-into-a-numpy-array not using the append function but creating the array all at once. This is not directly relevant to the question; I am just explaining why I wanted all the data in memory. – SentinalBais Feb 25 '14 at 13:06
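Following the generator suggestion in the comments, a minimal sketch that streams the file straight into a NumPy array via np.fromiter, so the intermediate list of Point objects is never built (this assumes the x, y, z fields come first on each data line, as in read_pcd above):

import numpy as np

def iter_pcd_values(fname):
    # Yield the x, y, z floats one at a time instead of
    # accumulating Point objects in a list.
    data_start = False
    with open(fname, 'r') as f:
        for line in f:
            words = line.split()
            if not words:
                continue
            if words[0] == "DATA":
                data_start = True
            elif data_start:
                yield float(words[0])
                yield float(words[1])
                yield float(words[2])

# np.fromiter fills the array directly from the generator;
# reshape(-1, 3) turns the flat value stream into N x 3 points.
points = np.fromiter(iter_pcd_values(TEST1_PCD), dtype=np.float64).reshape(-1, 3)

At ~1 million points this is about 24 MB for the array, instead of hundreds of MB of Point instances.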

2 Answers


Unfortunately, you may not be able to free all the memory:

http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm

Also, try gc.collect(), and see the second answer to this question:

How can I explicitly free memory in Python?
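One way to apply the advice from that linked question is to do the memory-hungry work in a child process, since a process's memory is reliably returned to the OS when it exits. A minimal sketch using multiprocessing (read_pcd is from the question; make_elevation_map stands in for your reduction step and is hypothetical):

from multiprocessing import Pool

def build_elevation_map(pcd_path):
    # Load the million points and reduce them to the (much smaller)
    # elevation map entirely inside the worker process.
    points = []
    read_pcd(pcd_path, points)
    return make_elevation_map(points)  # hypothetical reduction step

if __name__ == '__main__':
    pool = Pool(processes=1)
    elevation_map = pool.apply(build_elevation_map, (TEST1_PCD,))
    pool.close()
    pool.join()
    # Once the worker exits, the OS reclaims every byte the point
    # list occupied, regardless of what CPython's allocator keeps.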

– fredtantini
  • I guess the points object is too large, like the article says... does this mean the points will never be freed during the lifetime of the application? Or is the memory considered freed by the Python interpreter while it looks occupied from the OS's point of view? – SentinalBais Feb 25 '14 at 12:48

An object is deleted once it is no longer referenced anywhere.

Its memory is freed process-internally and can be reused later if a request of about the same size comes along. Only under certain conditions is memory returned to the OS.

For a larger object, the memory management may instead have obtained memory for it directly from the OS via mmap(), exclusively for that object. Upon freeing such an object, its memory becomes available to the OS immediately.

If you do del points[:], only the contents are freed; I am not sure whether the array holding the references to the former contents is shrunk. Doing del points should be the better choice.

Besides, it is possible that Point() internally keeps references to the created objects (though I see no reason why it would). In that case, del points won't free them, as they are still referenced internally.
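The list-shrinking question can be checked empirically: in CPython, del lst[:] does release the list's internal pointer array, which sys.getsizeof makes visible. A small sketch (for a million elements that array is roughly 8 MB, which plausibly matches the -7.906 MiB step in the profiler output above; the element objects themselves stay in CPython's allocator pools):

import sys

points = [object() for _ in range(1000000)]
print(sys.getsizeof(points))  # size of the list object incl. its pointer
                              # array: roughly 8 MB of 64-bit pointers

del points[:]
print(sys.getsizeof(points))  # back to the size of an empty list:
                              # the pointer array has been released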

– glglgl