I have several csv files in several zip files in on folder, so for example:
- A.zip (contains csv1,csv2,csv3)
- B.zip (contains csv4, csv5, csv6)
which are in the folder path C:/Folder/
, when I load normal csv files in a folder I use the following code:
import glob
import pandas as pd
files = glob.glob("C/folder/*.csv")
dfs = [pd.read_csv(f, header=None, sep=";") for f in files]
df = pd.concat(dfs,ignore_index=True)
followed by this post: Reading csv zipped files in python
One csv in zip works like this:
import pandas as pd
import zipfile
zf = zipfile.ZipFile('C:/Users/Desktop/THEZIPFILE.zip')
df = pd.read_csv(zf.open('intfile.csv'))
Any idea how to optimize this loop for me?