I have a file "my_file.csv.gz" which I've opened using:

file = gzip.open("my_file.csv.gz", mode = "r")

How can I get the column names from this result? Do I have to call file.read()? When I log the object with logger.info(f"file: {file}"), I get

<gzip _io.TextIOWrapper encoding='UTF-8' 0x7fc31958faf0>

and I'm not really sure what to do with this result. Thanks!

  • You can treat that like a file. Pass it anywhere you would pass a file object. – Tim Roberts Oct 19 '22 at 19:57
  • Have you tried just passing the TextIOWrapper instance you got to `csv.reader` ? It does not seem to be more complicated than that. – jsbueno Oct 19 '22 at 19:57
  • Duplicate, look at this: https://stackoverflow.com/questions/12902540/read-from-a-gzip-file-in-python – orby Oct 19 '22 at 19:59

1 Answer

I work with gzipped files a lot. I prefer to read them into a pandas DataFrame with pd.read_csv, setting the compression type to gzip. My files use | as the separator rather than the default comma, so I specify that as well; yours may differ. Once the file is loaded, the column names are available as df.columns.

import pandas as pd

df = pd.read_csv('my_file.csv.gz', compression='gzip', sep='|')  # your separator may differ
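
If you only need the column names and would rather not load the whole file, the comments' suggestion of passing the file object to csv.reader also works. Below is a minimal sketch using only the standard library. It assumes the first row of the file is the header and that the file is UTF-8 encoded. The helper name read_header is made up for illustration. Note that it opens the file with mode "rt": with gzip.open, plain "r" means binary mode, and csv.reader expects text.

```python
import csv
import gzip


def read_header(path, delimiter=","):
    """Return the first row of a gzipped CSV as a list of column names.

    Hypothetical helper: assumes the first row is the header and the
    file is UTF-8 encoded.
    """
    # "rt" opens the archive in text mode, so csv.reader receives str lines.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        # next() consumes only the first row; the default [] covers an empty file.
        return next(reader, [])
```

Calling read_header("my_file.csv.gz") should give a list of strings; pass delimiter="|" if your file is pipe-separated like mine.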
amance