Can you please tell me how I can combine CSV files using pandas in Python? I have three folders with two CSV files in each, and I need to combine the six CSV files. The image below shows an overview of the result I'm looking for.
-1
-
Combine them how? If you only need to concatenate them one after another, you can do that just as well with the built-in `csv` module. – AKX Jun 04 '21 at 10:28
-
Does this answer your question? [Import multiple csv files into pandas and concatenate into one DataFrame](https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe) – SunilG Jun 04 '21 at 10:31
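The `csv`-module route mentioned in the first comment could look roughly like the sketch below. It assumes every file starts with the same header row, which is written once; the function name `concat_csv_files` and its parameters are made up for illustration.

```python
import csv


def concat_csv_files(paths, out_path):
    """Write the rows of every CSV in `paths` to `out_path`, keeping one header.

    Assumes all files share the same header row; a file whose header differs
    from the first one raises ValueError.
    """
    header = None
    with open(out_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # empty file, nothing to copy
                if header is None:
                    # first non-empty file: remember and write its header
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header: {file_header}")
                # copy the remaining (data) rows unchanged
                writer.writerows(reader)
```

This only stacks rows; it does no type conversion, so pandas is still the better fit if the data needs further processing.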
2 Answers
0
Do they have the same columns and datatypes? If yes, try the following:
import pandas as pd
# put the paths in a list
paths_list = [path1, path2, ...]
# read each csv file and concatenate them all into one DataFrame
# (pd.concat is used because DataFrame.append returned a new object rather than
# modifying df in place, and was removed in pandas 2.0)
df = pd.concat((pd.read_csv(path) for path in paths_list), ignore_index=True)

AmineBTG
-
Hi @AmineBTG, thank you for the answer. Yes, they have the same columns and the same datatypes as well. Is there any way to pick up the paths of the CSV files without having to type them manually? – Timeless Jun 04 '21 at 10:35
-
@Mes3oud92 Yes, there is a way. You could use the `os` module to loop through your folders and extract the file paths automatically, but I do not know your folder structure, so I cannot help you much with this. – AmineBTG Jun 04 '21 at 10:39
-
Thanks @AmineBTG, there is an image attached to my post that shows the folder tree. – Timeless Jun 04 '21 at 10:43
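Collecting the paths automatically with the `os` module, as suggested in the comments above, might be sketched as follows. The folder tree is only shown in the question's image, so the layout assumed here (a parent folder whose subfolders hold files with names ending in `Delta.csv` or `Full.csv`) is a guess, and `find_csv_paths` is a hypothetical helper.

```python
import os


def find_csv_paths(parent, suffix):
    """Return sorted paths of all files under `parent` whose name ends with `suffix`."""
    matches = []
    # os.walk visits `parent` and every folder below it
    for dirpath, _dirnames, filenames in os.walk(parent):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

For example, `find_csv_paths('PARENT', 'Delta.csv')` would produce a list that can be used as `paths_list`. Note that a combined file saved inside the same parent folder under a name ending in the same suffix would be picked up again on a later run.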
0
`pd.read_csv` does not expand wildcards itself, on Linux or on Windows, so passing a pattern such as 'PARENT/Folder*/*Delta.csv' straight to it fails. Use `glob` to get the file paths first, then concatenate:
import glob
import pandas as pd
# collect the matching files from every Folder* directory
delta_files = sorted(glob.glob('PARENT/Folder*/*Delta.csv'))
full_files = sorted(glob.glob('PARENT/Folder*/*Full.csv'))
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
df_delta = pd.concat((pd.read_csv(f) for f in delta_files), ignore_index=True)
df_full = pd.concat((pd.read_csv(f) for f in full_files), ignore_index=True)
# index=False keeps the row index out of the output files
df_delta.to_csv('PARENT/Parent_Delta.csv', index=False)
df_full.to_csv('PARENT/Parent_Full.csv', index=False)

williamr21
-
Thank you @williamr21, I will try this right now and see how it plays out! – Timeless Jun 04 '21 at 10:45
-
I got this error. @williamr21, can you tell me what it means, please? OSError: [Errno 22] Invalid argument: 'C:/Users/Public/PARENT/Folder*/*Delta.csv' – Timeless Jun 04 '21 at 10:58
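The error most likely means the pattern string itself was handed to `pd.read_csv`, which tries to open it as a literal file name; `*` is not a valid character in a Windows path, hence `Invalid argument`. Below is a minimal sketch that expands the pattern first and fails with a clearer message when nothing matches. `combine_matching` is a hypothetical helper, and it relies on the files having the same columns, as stated in the comments above.

```python
import glob

import pandas as pd


def combine_matching(pattern):
    """Expand a wildcard pattern with glob, then read and stack the matching CSVs."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        # pd.concat raises on an empty sequence, so report the real problem instead
        raise FileNotFoundError(f"no files match {pattern!r}")
    frames = [pd.read_csv(path) for path in paths]
    return pd.concat(frames, ignore_index=True)
```

With the path from the error message, usage might look like `combine_matching('C:/Users/Public/PARENT/Folder*/*Delta.csv').to_csv('C:/Users/Public/PARENT/Parent_Delta.csv', index=False)`.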
-