gzip
itself will raise an OSError
if it's not a gzipped file.
>>> with gzip.open('README.md', 'rb') as f:
... f.read()
...
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/Users/dennis/.asdf/installs/python/3.6.6/lib/python3.6/gzip.py", line 276, in read
return self._buffer.read(size)
File "/Users/dennis/.asdf/installs/python/3.6.6/lib/python3.6/gzip.py", line 463, in read
if not self._read_gzip_header():
File "/Users/dennis/.asdf/installs/python/3.6.6/lib/python3.6/gzip.py", line 411, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'# ')
Can combine this approach with some others to increase confidence, such as checking the mimetype or looking for a magic number in the file header (see other answers for an example) and checking the extension.
import pathlib
if '.gz' in pathlib.Path(filepath).suffixes:
# some more inexpensive checks until confident we can attempt to decompress
# ...
try ...
...
except OSError as e:
...