I have a zipped file with this layout: zipped file --> 10 folders --> 20 CSV files in each folder

  • the zipped file is named yyyy-mm
  • the folders are named yyyy-mm-dd
  • the CSV files are named after different times of the day

I tried the following code, but it does not work:

import pandas as pd
import os
import glob
     
myzip=zipfile.ZipFile("C:/xxx/xxx/xxx/xxx/2021-01.zip")
for fname in myzip.namelist():
    if 'csv' not in fname:
        pathname = "C:/xxx/xxx/xxx/xxx/2021-01.zip/" + fname
        path = os.getcwd()
        csv_files = glob.glob(os.path.join(pathname, "*.csv"))  
     
        for f in csv_files:
            # read the csv file
            df = pd.read_csv(f)

            # print the location and filename
            print('Location:', f)
            print('File Name:', f.split("\\")[-1])

            # print the content
            print('Content:')
            display(df)
            print()

1 Answer

If it is not necessary to work with zipped files, you can unzip them first:

import zipfile
with zipfile.ZipFile(path_to_zip_file, 'r') as zip_ref:
    zip_ref.extractall(directory_to_extract_to)

And then work with the extracted folders normally.
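
As a rough sketch of what "work with the extracted folders" could look like, the snippet below extracts the archive and then collects the CSV paths per day folder. The function name `extract_and_list_csvs` and the parameter `target_dir` are made up for illustration, and it assumes there is enough disk space for the extracted files.

```python
import zipfile
from pathlib import Path


def extract_and_list_csvs(zip_path, target_dir):
    """Extract the archive, then return the CSV paths grouped by folder name.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(target_dir)

    by_folder = {}
    # rglob searches every extracted sub-folder for files ending in .csv.
    for csv_path in sorted(Path(target_dir).rglob("*.csv")):
        by_folder.setdefault(csv_path.parent.name, []).append(csv_path)
    return by_folder


# Usage (hypothetical paths; requires pandas for the read step):
#   import pandas as pd
#   for folder, paths in extract_and_list_csvs("2021-01.zip", "extracted").items():
#       for p in paths:
#           df = pd.read_csv(p)
```

Each returned path is an ordinary file on disk, so it can be passed to `pd.read_csv` as usual.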

NesteruS
  • But the CSV files are huge, so extracting them is not storage friendly. – jojolee Aug 11 '21 at 04:27
  • Then, try using [these answers](https://stackoverflow.com/questions/26942476/reading-csv-zipped-files-in-python) on working with zipped CSV files – NesteruS Aug 11 '21 at 04:55
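
A minimal sketch of reading the CSV files straight from the archive, without extracting anything to disk, might look like this. The helper name `iter_zipped_csvs` is hypothetical. It relies on `ZipFile.namelist()` returning full member paths (folder entries included) and on `ZipFile.open()` returning a binary file-like object.

```python
import zipfile


def iter_zipped_csvs(zip_path):
    """Yield (member_name, binary_handle) for every CSV inside the archive.

    Hypothetical helper: members are decompressed on the fly, never written
    to disk. Each handle is only valid until the next iteration step.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip folder entries such as "2021-01-01/" and non-CSV members.
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as handle:
                yield name, handle


# Usage (hypothetical path; requires pandas):
#   import pandas as pd
#   for name, handle in iter_zipped_csvs("C:/xxx/xxx/xxx/xxx/2021-01.zip"):
#       df = pd.read_csv(handle)
#       print("File Name:", name.split("/")[-1])
```

Member names inside a zip archive use forward slashes, so splitting on `"/"` gives the file name regardless of the operating system. Passing a zip path to `glob` cannot work, because the archive is a single file and not a directory on disk.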