I wish to unpack some tar archives, but I only want to process the non-empty ones. I found some code for gzip archives ("How to check empty gzip file in Python") and also this:

async def is_nonempty_tar_file(self, tarfile):
    with open(tarfile, "rb") as f:
        try:
            file_content = f.read(1)
            return len(file_content) > 1
        except Exception as exc:
            self.logger.error(
                f"Reading tarfile failed for {tarfile}", exc_info=True
            )

All the tar archives, both empty and non-empty ones, seem to contain at least this character: \x1f. So they all pass the test even if they are empty.

How else can I check this?

KZiovas
  • How about using your command line utils, like `tar -tvf [tarfile]`, and checking if it contains anything? – sarema Jan 12 '22 at 18:11
  • Hey. I want to use Python; I am writing a Python service, so I am looking for a Python tool to do what you suggest. – KZiovas Jan 12 '22 at 18:13
  • Why don't you just try unpacking it? Surely that's just as quick as checking first? – JeffUK Jan 12 '22 at 18:17
  • Because I also want to make one list of the empty archives and one of the non-empty ones. – KZiovas Jan 12 '22 at 18:19

2 Answers

You can list contents of tarfiles with the tarfile module:

https://docs.python.org/3/library/tarfile.html#command-line-options

You can probably just open the archive with tarfile.open and check whether it contains any members.

import tarfile

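# The context manager closes the archive once the listing is done.
with tarfile.open("the_file.tar") as tar:
    tar.list()  # prints the member table to stdout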
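
Since list() only prints the member table and returns None, a programmatic check has to look at the members themselves. Here is a minimal sketch of that idea; the helper name `tar_has_members` is hypothetical (it is not part of the tarfile module), and it assumes that seeing one member is enough to call the archive non-empty:

```python
import tarfile


def tar_has_members(path):
    """Return True if the archive at `path` holds at least one member."""
    # Mode "r" lets tarfile detect gzip/bz2/xz compression transparently.
    with tarfile.open(path, "r") as tar:
        # Iterating a TarFile yields TarInfo objects lazily, so asking for
        # the first one avoids scanning the whole archive.
        return next(iter(tar), None) is not None
```

A zero-byte or corrupt file is not an "empty archive" here: tarfile.open raises tarfile.ReadError for it, so the caller may want to catch that separately.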
sarema

OK, I found a way using the getmembers() method from the tarfile module. I wrote this method, which checks for non-empty tar files:

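import tarfile

def is_nonempty_tar_file(self, archive):
    try:
        # tarfile.open itself raises on unreadable files, so it belongs inside the try.
        with tarfile.open(archive, "r") as tar:
            # getmembers() returns a list of TarInfo objects, one per member.
            return len(tar.getmembers()) > 0
    except (tarfile.TarError, OSError):
        print(f"Reading tarfile failed for {archive}")
        # Unreadable archives are reported as not non-empty.
        return False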
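
To build the two lists mentioned in the comments (empty and non-empty archives), the same getmembers() check can be applied to a batch of paths. This is only a sketch: the function name `split_by_emptiness` and the third "unreadable" list are my own assumptions, added so that corrupt or missing files are not silently counted as empty.

```python
import tarfile


def split_by_emptiness(paths):
    """Partition archive paths into (non_empty, empty, unreadable) lists."""
    non_empty, empty, unreadable = [], [], []
    for path in paths:
        try:
            with tarfile.open(path, "r") as tar:
                has_members = len(tar.getmembers()) > 0
        except (tarfile.TarError, OSError):
            # Corrupt, zero-byte, or missing files land here rather than
            # being mistaken for valid but empty archives.
            unreadable.append(path)
            continue
        (non_empty if has_members else empty).append(path)
    return non_empty, empty, unreadable
```

Usage would look like `non_empty, empty, unreadable = split_by_emptiness(list_of_paths)`, after which only the paths in `non_empty` need to be unpacked.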
KZiovas