102

I've just been doing some exercises with gzip in Python.

import gzip
f=gzip.open('Onlyfinnaly.log.gz','rb')
file_content=f.read()
print file_content

I get no output on the screen. As a Python beginner, I'm wondering what I should do to read the contents of the file inside the gzip archive. Thank you.

Jon Clements
Michael
  • 7
    Try `print open('Onlyfinnaly.log.gz', 'rb').read().decode('zlib')`. If that doesn't work, can you confirm that the file contains something? – Blender Oct 15 '12 at 19:23
  • Yeah, I'm sure the file 'Onlyfinally.log' exists. What I'm trying to do is read its content and select some of it to store in another file, but all I get is a blank line on the screen. – Michael Oct 15 '12 at 19:48

5 Answers

100

Try gzipping some data through the gzip library like this...

import gzip
content = "Lots of content here"
f = gzip.open('Onlyfinnaly.log.gz', 'wb')
f.write(content)
f.close()

... then run your code as posted ...

import gzip
f=gzip.open('Onlyfinnaly.log.gz','rb')
file_content=f.read()
print file_content

This worked for me; for some reason the gzip library fails to read some files that were not written through it.
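
On Python 3 the same round trip needs either bytes in binary mode or one of the text modes. Here is a minimal sketch of the text-mode version, assuming a scratch file called `example.log.gz` in a temporary directory (that name is made up for the illustration and is not the file from the question):

```python
import gzip
import os
import tempfile

# Hypothetical scratch path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.log.gz")

# Write text through gzip; "wt" accepts str instead of bytes.
with gzip.open(path, "wt", encoding="utf-8") as f:
    f.write("Lots of content here\n")

# Read it back as text; the with-block closes the file afterwards.
with gzip.open(path, "rt", encoding="utf-8") as f:
    file_content = f.read()

print(file_content)
```

If this prints the text while the original file still prints nothing, that would suggest the problem lies with the original archive rather than with the reading code.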

Matt Olan
  • 16
    It's slightly preferable to use `with` like in @Arunava's answer, because the file will be closed even if an error occurs while reading (or you forget about it). As a bonus it's also shorter. – Mark Jan 21 '17 at 20:46
80

python: read lines from compressed text files

Using gzip.open, which returns a gzip.GzipFile:

import gzip

with gzip.open('input.gz','r') as fin:
    for line in fin:
        print('got line', line)
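
Note that with mode 'r' (binary) each line comes back as bytes on Python 3, while 'rt' yields str. A small sketch of the difference, using an in-memory archive built with `gzip.compress` so that it is self-contained; the sample data is made up:

```python
import gzip
import io

# Made-up sample data, compressed in memory so no file is needed.
raw = gzip.compress(b"first line\nsecond line\n")

# Binary mode ('rb', the default): each line is a bytes object.
with gzip.open(io.BytesIO(raw), "rb") as fin:
    binary_lines = [line for line in fin]

# Text mode ('rt'): each line is decoded to str.
with gzip.open(io.BytesIO(raw), "rt", encoding="utf-8") as fin:
    text_lines = [line for line in fin]

print(binary_lines)
print(text_lines)
```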
vvvvv
Arunava Ghosh
  • 4
    TIL: The mode argument gzip.open can be any of 'r', 'rb', 'a', 'ab', 'w', 'wb', 'x' or 'xb' for binary mode, or 'rt', 'at', 'wt', or 'xt' for text mode. The default is 'rb'. https://docs.python.org/3/library/gzip.html – Trutane Feb 10 '22 at 20:58
18

If you want to read the contents into a string, open the file in text mode (mode="rt"):

import gzip

with gzip.open("Onlyfinnaly.log.gz", mode="rt") as f:
    file_content = f.read()
    print(file_content)
Michael Hall
1

I needed a method which could parse both .txt and .txt.gz:

if filename.endswith('.gz'):
    import gzip
    my_open = gzip.open
else:
    my_open = open

with my_open(filename, 'rt') as txt:
    for line in txt:
        print(line)
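
A variant of the same idea, sketched as a helper that checks the two gzip magic bytes (0x1f 0x8b) instead of the file extension. The function name `open_text` and the demo file names are hypothetical, and the check is a heuristic: a non-gzip file could in principle start with the same two bytes.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip stream


def open_text(filename, encoding="utf-8"):
    """Open a plain or gzip-compressed text file for reading (hypothetical helper)."""
    with open(filename, "rb") as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    opener = gzip.open if is_gzip else open
    return opener(filename, "rt", encoding=encoding)


# Two throwaway files created just for this demo.
tmpdir = tempfile.mkdtemp()
plain_path = os.path.join(tmpdir, "demo.txt")
gz_path = os.path.join(tmpdir, "demo.txt.gz")

with open(plain_path, "w", encoding="utf-8") as f:
    f.write("hello\n")
with gzip.open(gz_path, "wt", encoding="utf-8") as f:
    f.write("hello\n")

# Both files are read through the same code path.
for path in (plain_path, gz_path):
    with open_text(path) as txt:
        for line in txt:
            print(line.rstrip("\n"))
```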
Eric Duminil
0

For Parquet files, use pandas to read them:

import pandas as pd

data = pd.read_parquet("file.parquet.gzip")
data.head()
Terence Yang