I am calling an API that returns a zip file, which may contain multiple CSV files:

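import zipfile
from io import BytesIO

import pandas as pd
import requests

api_url = res.json()['export_url']
# `pass` is a reserved word in Python, so the password variable is named `password`
new_res = requests.get(api_url, auth=(user, password))
filebytes = BytesIO(new_res.content)
myzipfile = zipfile.ZipFile(filebytes)
# without parentheses this only references the method; nothing is extracted
a = myzipfile.extractall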
for name in myzipfile.namelist():
    print(name)

I can see the file names, but I can't read each file into a DataFrame:

for name in myzipfile.namelist():
    df = pd.read_csv(name)

The error is:

FileNotFoundError: [Errno 2] File data.csv does not exist: 'data.csv'

I tried:

for name in myzipfile.printdir():
    print(name)

and then reading each name as a CSV, but that didn't work either.

alim1990

1 Answer

The file is still inside the zip archive, so `pd.read_csv(name)` looks for a file with that name on disk and fails. `ZipFile` has its own `open` method for reading archive members; it returns a file-like object that you can pass directly to `pd.read_csv`.

for name in myzipfile.namelist():
    with myzipfile.open(name) as myfile:
        df = pd.read_csv(myfile)
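
Note that the loop above rebinds `df` on every iteration, so only the last file's data is left at the end. A minimal sketch of one way to keep them all, assuming you want one DataFrame per CSV keyed by member name. The in-memory archive and its file names (`a.csv`, `b.csv`, `readme.txt`) are made up for illustration and stand in for `BytesIO(new_res.content)`:

```python
import zipfile
from io import BytesIO

import pandas as pd

# Stand-in for BytesIO(new_res.content): a tiny archive with made-up members.
filebytes = BytesIO()
with zipfile.ZipFile(filebytes, "w") as zf:
    zf.writestr("a.csv", "id,value\n1,10\n2,20\n")
    zf.writestr("b.csv", "id,value\n3,30\n")
    zf.writestr("readme.txt", "not a csv")

frames = {}
with zipfile.ZipFile(filebytes) as myzipfile:
    for name in myzipfile.namelist():
        if not name.lower().endswith(".csv"):
            continue  # skip non-CSV members and directory entries
        with myzipfile.open(name) as myfile:
            frames[name] = pd.read_csv(myfile)

# Show which members were loaded and the shape of each DataFrame.
print({name: df.shape for name, df in frames.items()})
```

If the files share the same columns, `pd.concat(frames.values())` would be one way to combine them afterwards.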
thshea
  • Any idea how to get the file size as well? – alim1990 Dec 07 '20 at 15:28
  • It looks like `myfile.seek(0,2)` then `size = myfile.tell()` will give you the size (in bytes) of any file/binary object in Python [as detailed here](https://stackoverflow.com/a/283719/11789440). I am assuming this holds for the file object created by zipfile, but I cannot test it right now. – thshea Dec 07 '20 at 20:57
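
For the file size asked about in the comments, the archive's own metadata may be simpler than seeking: `ZipFile.infolist()` (or `ZipFile.getinfo(name)`) returns `ZipInfo` objects whose `file_size` is the uncompressed size in bytes and whose `compress_size` is the size stored in the archive. A small sketch, where the in-memory archive, the member name `data.csv`, and its content are assumptions for illustration:

```python
import zipfile
from io import BytesIO

# Build a small in-memory archive so the example is self-contained.
payload = b"id,value\n1,10\n2,20\n"
buffer = BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("data.csv", payload)

myzipfile = zipfile.ZipFile(buffer)
sizes = {}
for info in myzipfile.infolist():
    # file_size: uncompressed bytes; compress_size: bytes stored in the archive
    sizes[info.filename] = info.file_size
    print(info.filename, info.file_size, info.compress_size)
```

This reads the sizes from the zip's central directory, so no member has to be opened or decompressed.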