
I need to identify a file's type (tar, tar.gz or zip). I found a solution on this site: Python - mechanism to identify compressed file type and uncompress

But that solution does not work for tar files, because a tar file does not begin with a fixed sequence of characters the way the other formats do.

magic_dict = {
    b"\x1f\x8b\x08": "gz",
    b"\x00\x00\x00": "tar",  # this is the entry that fails to match tar files
    b"\x50\x4b\x03\x04": "zip",
}

max_len = max(len(x) for x in magic_dict)

def file_type(filename):
    # Read in binary mode so the raw bytes can be compared with the signatures.
    with open(filename, "rb") as f:
        file_start = f.read(max_len)
    for magic, filetype in magic_dict.items():
        if file_start.startswith(magic):
            return filetype
    return "no match"

How can I detect a tar file?

user7454761

1 Answer


At least GNU tar does write a "magic signature", but it is not at offset 0 (the beginning of the file). It is at offset 257, and it is the string ustar followed by a NUL character; see https://en.wikipedia.org/wiki/Tar_(computing)#UStar_format
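
A minimal sketch of how the function from the question could be adapted, assuming Python 3 and that the archive carries the ustar signature at offset 257. The names START_MAGIC, TAR_MAGIC_OFFSET and TAR_MAGIC are made up for this example. Only the five bytes ustar are compared, since the bytes that follow can differ between tar variants.

```python
# Signatures that sit at the very start of the file.
START_MAGIC = {
    b"\x1f\x8b\x08": "gz",
    b"\x50\x4b\x03\x04": "zip",
}

# The tar signature sits inside the first header block, not at offset 0.
TAR_MAGIC_OFFSET = 257
TAR_MAGIC = b"ustar"


def file_type(filename):
    # Read enough bytes to cover the tar signature as well.
    with open(filename, "rb") as f:
        header = f.read(TAR_MAGIC_OFFSET + len(TAR_MAGIC))
    for magic, filetype in START_MAGIC.items():
        if header.startswith(magic):
            return filetype
    end = TAR_MAGIC_OFFSET + len(TAR_MAGIC)
    if header[TAR_MAGIC_OFFSET:end] == TAR_MAGIC:
        return "tar"
    return "no match"
```

Note that a tar.gz file is reported as "gz" here, because only the outer gzip layer is visible without decompressing. Very old tar archives written before the ustar format may have no signature at all, and for those this check returns "no match". The standard library's tarfile.is_tarfile(filename) may be a simpler alternative if opening the archive is acceptable.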

Błotosmętek