
I have a tar.gz file which contains a hierarchy of files, folders and other tar.gz files within them.

I don't know how deep the directory structure goes; it varies from file to file.

I would like to write a Python script that traverses all the compressed files and extracts only the files with specified extensions.

  • Try the examples from the [docs](https://docs.python.org/3/library/tarfile.html#examples). And also try [this](https://stackoverflow.com/a/35690896/4502878) answer. – Diptangsu Goswami Aug 01 '19 at 11:50

1 Answer


You can use the tarfile module to open tar.gz files, then call getmembers() to list the archive's members. Open the ones you want.

Members that are themselves .tar.gz files should be processed recursively, much like the top-level file. The only difference is that you'll probably need to pass a file-like object (for example, the one returned by extractfile()) to TarFile.open through its fileobj argument instead of a filename.
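The recursion described above could be sketched roughly as follows. This is only an illustration under some assumptions: the function name `extract_matching`, the flattened output names (slashes replaced with `__` so a member path cannot escape the output folder), and the list of archive suffixes are all made up for the example, and each nested archive is read fully into memory.

```python
import io
import os
import tarfile


def extract_matching(fileobj, extensions, out_dir, prefix=""):
    """Walk a tar archive (any compression tarfile can detect) and save
    members whose names end with one of `extensions` into `out_dir`.

    Hypothetical helper: nested archives are recognised by suffix only.
    Returns the list of paths written.
    """
    saved = []
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links, devices
            data = tar.extractfile(member).read()
            name = prefix + member.name
            if member.name.endswith((".tar.gz", ".tgz", ".tar")):
                # Nested archive: recurse on an in-memory file-like object.
                saved += extract_matching(
                    io.BytesIO(data), extensions, out_dir, name + "/"
                )
            elif member.name.endswith(tuple(extensions)):
                # Flatten the path so it always lands inside out_dir.
                target = os.path.join(out_dir, name.replace("/", "__"))
                with open(target, "wb") as out:
                    out.write(data)
                saved.append(target)
    return saved
```

For a file on disk, the top-level call would look like `extract_matching(open("top.tar.gz", "rb"), [".txt", ".csv"], "out")`, where the filename, extensions, and output folder are placeholders and the output folder must already exist.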

zmbq
  • It's not that I just have tar files within tar; it could be any compression. I have tried recursive traversal, but my Python script fails at some point –  Aug 02 '19 at 09:30
  • You need to handle all kinds of compression you want to support. As for recursion failing - you probably have a bug. – zmbq Aug 04 '19 at 06:39
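
One way to handle more than one archive format is to detect the format from the content rather than the filename. The sketch below covers only tar (with whatever compression tarfile detects) and zip, as an example. The generator name `iter_files` is invented for illustration, everything is held in memory, and zip-based formats such as .jar or .docx would also be descended into.

```python
import io
import tarfile
import zipfile


def iter_files(data, path=""):
    """Yield (path, bytes) for every regular file found in `data`,
    descending into nested tar and zip archives.

    Hypothetical helper: anything that is neither tar nor zip is a leaf.
    """
    buf = io.BytesIO(data)
    try:
        # "r:*" lets tarfile work out the compression by itself.
        tar = tarfile.open(fileobj=buf, mode="r:*")
    except tarfile.ReadError:
        tar = None
    if tar is not None:
        with tar:
            for member in tar.getmembers():
                if member.isfile():
                    inner = tar.extractfile(member).read()
                    yield from iter_files(inner, f"{path}{member.name}/")
        return
    buf.seek(0)
    if zipfile.is_zipfile(buf):
        with zipfile.ZipFile(buf) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    yield from iter_files(zf.read(info), f"{path}{info.filename}/")
        return
    # Not an archive we recognise: report it as an ordinary file.
    yield path.rstrip("/"), data
```

Filtering by extension then becomes a simple check on the yielded path, for example `[p for p, _ in iter_files(raw) if p.endswith(".txt")]`, where `raw` holds the bytes of the top-level archive.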