Memory leak when using py7zlib to open .7z archives

Question

I am trying to use py7zlib to open and read files stored in .7z archives. I am able to do this, but it appears to be causing a memory leak. After scanning through a few hundred .7z files using py7zlib, Python crashes with a MemoryError. I don't have this problem when doing the equivalent operations on .zip files using the built-in zipfile library. My process with the .7z files is essentially as follows (look for a subfile in the archive with a given name and return its contents):

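import py7zlib

def read_subfile(filename, subName):
    # Return the contents of subName inside the .7z archive, or None if absent.
    with open(filename, 'rb') as f:
        z = py7zlib.Archive7z(f)
        names = z.getnames()
        if subName in names:
            subFile = z.getmember(subName)
            contents = subFile.read()
        else:
            contents = None

    return contents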

Does anyone know why this would be causing a memory leak once the Archive7z object passes out of scope if I am closing the .7z file object? Is there any kind of cleanup or file-closing procedure I need to follow (like with the zipfile library's ZipFile.close())?

You can investigate it yourself; this SO answer about [Python memory leaks](http://stackoverflow.com/a/1435426/1334930) may help. — Tolio, Aug 07 '15 at 18:55
Your code looks fine. It's probably a bug in the `py7zlib` extension. It's relatively easy (and common) to forget to decrement a reference counter when writing a Python extension. Report it to the extension's author(s). — martineau, Aug 07 '15 at 19:36
Yes, that was a typo; apologies. I tried using objgraph.show_most_common_types like in the example, but it didn't tell me anything particularly useful. I am noticing that py7zlib loads the entire contents of the .7z file uncompressed into memory when trying to read any one subfile. There may be additional memory leaks, but that is almost certainly the cause of the MemoryErrors. — David Pitchford, Aug 07 '15 at 20:12

0 Answers