I want to read all the CSV files from a zipped file, but the archive contains more than one CSV. The URL of the zip file is in the code below. Any answer would be appreciated. I tried this, but I get an error saying that there is more than one CSV file in the zip file.

df = pd.read_csv('http://dados.cvm.gov.br/dados/FIDC/DOC/INF_MENSAL/DADOS/inf_mensal_fidc_202006.zip', compression='zip', header=1, sep=';', quotechar='"')
print(df)
  • First you have to unzip the file. See this [SO question](https://stackoverflow.com/questions/3451111/unzipping-files-in-python) – asantz96 Aug 07 '20 at 18:31
  • See if this article I wrote helps: [zip into Pandas](https://samukweku.github.io/data-wrangling-blog/python/pandas/compressed%20data/2020/07/21/Extract-DataFrame-from-Compressed-Data-into-Pandas.html) – sammywemmy Aug 07 '20 at 19:33

1 Answer

import os
import urllib.request
import zipfile
import pandas as pd

# First, download the zip file into the current working directory
url = 'http://dados.cvm.gov.br/dados/FIDC/DOC/INF_MENSAL/DADOS/inf_mensal_fidc_202006.zip'
file_name = "dados.zip"
path = os.getcwd()
destination = os.path.join(path, file_name)
urllib.request.urlretrieve(url, destination)

# Second, extract everything from the zip into a destination folder
directory_name = "dados_folder"
zip_destination = os.path.join(path, directory_name)
os.makedirs(zip_destination, exist_ok=True)  # does not fail if the folder already exists
with zipfile.ZipFile(destination, 'r') as zip_ref:
    zip_ref.extractall(zip_destination)

# Now read each CSV one by one and keep every DataFrame in a dict keyed by table suffix
roman_numerals = ["I", "II", "III", "IV", "IX", "V", "VI", "VII", "X_1", "X_2", "X_3", "X_4", "X_5", "X_6", "X_7", "X_1_1"]
dfs = {}
for x in roman_numerals:
    # the CSVs live in the extraction folder, not in the working directory
    name_csv = os.path.join(zip_destination, "inf_mensal_fidc_tab_" + x + "_202006.csv")
    # sep=';' is taken from the question; if pandas raises a UnicodeDecodeError,
    # passing an explicit encoding (for example 'latin-1') may help
    dfs[x] = pd.read_csv(name_csv, sep=';')
Ariadne R.
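
If you would rather not extract anything to disk or hard-code the CSV names, one possible alternative is to open the archive in memory and loop over its members. This is only a sketch: `read_all_csvs` and `fetch_zip` are hypothetical helper names made up for illustration, and `sep=';'` in the usage note is taken from the question rather than checked against every file.

```python
import io
import urllib.request
import zipfile

import pandas as pd


def read_all_csvs(zip_source, **read_csv_kwargs):
    """Return {member name: DataFrame} for every .csv member of a zip archive.

    zip_source may be a path or a binary file-like object.
    Extra keyword arguments are passed straight to pd.read_csv.
    """
    frames = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # skip anything in the archive that is not a CSV
            if name.lower().endswith(".csv"):
                with archive.open(name) as member:
                    frames[name] = pd.read_csv(member, **read_csv_kwargs)
    return frames


def fetch_zip(url):
    """Download a zip archive and return it as an in-memory buffer."""
    with urllib.request.urlopen(url) as response:
        return io.BytesIO(response.read())
```

With the URL from the question this would be called as `frames = read_all_csvs(fetch_zip(url), sep=';')`, after which `frames` maps each CSV file name in the archive to its own DataFrame. Nothing is written to disk, so there is no folder to clean up afterwards.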