I am hunting for a memory leak in someone else's code. I found:

def current(self):
    ...
    data = PyBuffer_New(buflen)
    PyObject_AsCharBuffer(data, &data_ptr, &buflen)
    ...
    return VideoFrame(data, self.frame_size, self.frame_mode,
                      timestamp=<double>self.frame.pts/<double>AV_TIME_BASE,
                      frameno=self.frame.display_picture_number)


cdef class VideoFrame:
    def __init__(self, data, size, mode, timestamp=0, frameno=0):
        self.data = data
        ...

There is no free or similar call in current(), nor in VideoFrame. Is the PyBuffer automatically freed when the VideoFrame object gets deleted?

P.R.

1 Answer

The answer is: "it depends; we don't have enough code to tell from your question." It's governed by the return type you've declared to Cython for PyBuffer_New. I'll give two simplified illustrative cases, and hopefully you'll be able to work it out for your more complicated case.

If you tell Cython that it returns a PyObject*, Cython has no innate knowledge of that type and doesn't do anything to keep track of the memory:

# BAD - memory leak!

cdef extern from "Python.h":
    ctypedef struct PyObject
    PyObject* PyBuffer_New(int size)

def test():
    cdef int i
    for i in range(100000): # call lots of times to allocate lots of memory
        # (type of a is automatically inferred to be PyObject*
        # to match the function definition)
        a = PyBuffer_New(1000)

and the generated code for the loop pretty much looks like:

for (__pyx_t_1 = 0; __pyx_t_1 < 100000; __pyx_t_1+=1) {
  __pyx_v_i = __pyx_t_1;
  __pyx_v_a = PyBuffer_New(1000);
}

i.e. memory is being allocated but never freed. If you run test() and look at a task manager you can see the memory usage jump up and not return.

Alternatively, if you tell Cython that it returns an object, Cython deals with it like any other Python object and manages the reference count correctly:

# Good - no memory leak

cdef extern from "Python.h":
    object PyBuffer_New(int size)

def test():
    cdef int i
    for i in range(100000): # call lots of times to allocate lots of memory
        a = PyBuffer_New(1000)

The generated code for the loop is then

for (__pyx_t_1 = 0; __pyx_t_1 < 100000; __pyx_t_1+=1) {
  __pyx_v_i = __pyx_t_1;
  __pyx_t_2 = PyBuffer_New(1000); if (unlikely(!__pyx_t_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 7; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_t_2);
  __Pyx_XDECREF_SET(__pyx_v_a, __pyx_t_2);
  __pyx_t_2 = 0;
}

Note the __Pyx_XDECREF_SET: it assigns the new object to a and drops the reference to the previous value, which is where the old object is deallocated. If you run test() here you see no long-term jump in memory usage.
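If you'd rather check this programmatically than watch a task manager, here is a rough sketch in plain Python 3 using the standard tracemalloc module. The names retained_bytes, keep_buffers and drop_buffers are made up for illustration, and bytearray is only a stand-in for PyBuffer_New. Whether tracemalloc would catch a particular C-level leak in your extension is an assumption you'd have to verify.

```python
import tracemalloc


def retained_bytes(func):
    """Return how many traced bytes are still allocated right after func() returns."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    result = func()  # hold the return value so it counts as retained
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


def keep_buffers():
    # Stand-in for the leaking case: every buffer stays reachable.
    return [bytearray(1000) for _ in range(1000)]


def drop_buffers():
    # Stand-in for the managed case: rebinding 'a' releases the previous buffer.
    for _ in range(1000):
        a = bytearray(1000)


if __name__ == "__main__":
    print("kept:   ", retained_bytes(keep_buffers))
    print("dropped:", retained_bytes(drop_buffers))
```

The first number should come out at roughly a megabyte and the second close to zero, which is the same difference you'd otherwise be eyeballing in the task manager.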

It may be possible to switch between these two cases depending on how a variable is declared with cdef (for example in the definition of VideoFrame). If they use PyObject* without careful DECREFs then they're probably leaking memory...
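For the case where the attribute is an ordinary Python object, the rule is the usual one: the buffer lives as long as something holds a reference to it, and goes away with the last reference. A minimal sketch in plain Python (CPython assumed; Buffer is a hypothetical stand-in for whatever PyBuffer_New returns, and this VideoFrame is a stripped-down version of the one in the question):

```python
import weakref


class Buffer:
    """Hypothetical stand-in for the object returned by PyBuffer_New."""

    def __init__(self, size):
        self.raw = bytearray(size)


class VideoFrame:
    """Stripped-down frame that only stores a reference to its data."""

    def __init__(self, data):
        self.data = data  # the frame now holds one reference to the buffer


def buffer_dies_with_frame():
    """Return (alive_before, alive_after) around deleting the frame."""
    buf = Buffer(1000)
    probe = weakref.ref(buf)  # observes the buffer without keeping it alive
    frame = VideoFrame(buf)
    del buf                   # the frame holds the only remaining reference
    alive_before = probe() is not None
    del frame                 # last reference gone, CPython frees the buffer
    alive_after = probe() is not None
    return alive_before, alive_after


if __name__ == "__main__":
    print(buffer_dies_with_frame())
```

A raw PyObject* attribute would not get this treatment, since Cython doesn't touch the reference count for it.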

DavidW
  • thanks! It seems the author has implemented the "correct" `object PyBuffer_New(int size)` way. Also I see that in `VideoFrame` `data` is defined as `cdef readonly object data` . Is that the case where it could leak? – P.R. Sep 09 '15 at 16:22
  • Just found that `__Pyx_GOTREF(__pyx_t_2);` is called in the C-file after `PyBuffer_New`. So I suppose everything is correctly implemented. – P.R. Sep 09 '15 at 16:29
  • @P.R. I can't see anything I'm immediately suspicious of (including for `VideoFrame.data`). Although obviously it's hard to tell from this. The tool of choice for finding memory leaks (on Linux at least) is Valgrind. It's been a while since I made real use of it though, so I can't give any real advice. – DavidW Sep 09 '15 at 17:11