Python reading multiple csv files and appending them into a df

Question

I have a folder with multiple csv files. The names of some of them start with the string 'REC_'. I would like to fetch all files starting with that string and append them into a single df. How can I do that?

The way I fetch just one would be

with open(path_to_my_folder, 'r') as csvfile:
    reader = csv.reader(csvfile)

This way I need to specify the exact file in the 'path_to_my_folder' variable.

as per answer or use `glob` - `https://www.geeksforgeeks.org/how-to-use-glob-function-to-find-files-recursively-in-python/` — Bruno Vermeulen, Jul 13 '21 at 08:38
Does this answer your question? [Import CSV file as a pandas DataFrame](https://stackoverflow.com/questions/14365542/import-csv-file-as-a-pandas-dataframe) — mpx, Jul 13 '21 at 08:39

score 2 · Answer 1 · answered Jul 13 '21 at 08:35

First, list all files whose names start with REC_ (if some of them are not .csv files, you need to check the extension as well). Then build a list of dataframes, one per REC_ file. Finally, use pd.concat() to concatenate the dataframes. Here axis=0 means they are concatenated along the rows (stacked vertically, one on top of the other).

REC_file_1.csv
val_1, val_2
1, 2
2, 4

REC_file_2.csv
val_1, val_2
3, 6
4, 8

import os
import pandas as pd

# All files in directory
print(os.listdir())
# ['other_file_1.csv', 'REC_file_1.csv', 'REC_file_2.csv', 'script.py']


rec_file_names = [file for file in os.listdir() if file.startswith('REC_')]
print(rec_file_names)  # ['REC_file_1.csv', 'REC_file_2.csv']

dataframes = []
for filename in rec_file_names:
    dataframes.append(pd.read_csv(filename))

data_concated = pd.concat(dataframes, axis=0)

print(data_concated)
   val_1   val_2
0      1       2
1      2       4
0      3       6
1      4       8
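
The output above keeps each file's own row labels, so the index repeats (0, 1, 0, 1). A minimal sketch of a variation, assuming you want the extension check mentioned earlier and a fresh 0..n-1 index: the function name `load_rec_files` and the throwaway demo files are made up for illustration, while `ignore_index=True` is a standard `pd.concat` argument.

```python
import os
import tempfile

import pandas as pd


def load_rec_files(folder):
    """Read every REC_*.csv file in `folder` into a single DataFrame.

    Hypothetical helper, not part of pandas.
    """
    # Keep only names with the right prefix AND the .csv extension
    names = sorted(
        name for name in os.listdir(folder)
        if name.startswith("REC_") and name.endswith(".csv")
    )
    frames = [pd.read_csv(os.path.join(folder, name)) for name in names]
    # ignore_index=True renumbers the rows instead of repeating each file's labels
    return pd.concat(frames, axis=0, ignore_index=True)


# Demo on throwaway files that mirror the sample data above
with tempfile.TemporaryDirectory() as folder:
    samples = {
        "REC_file_1.csv": "val_1,val_2\n1,2\n2,4\n",
        "REC_file_2.csv": "val_1,val_2\n3,6\n4,8\n",
        "REC_notes.txt": "not a csv\n",            # skipped: wrong extension
        "other_file_1.csv": "val_1,val_2\n9,9\n",  # skipped: wrong prefix
    }
    for name, text in samples.items():
        with open(os.path.join(folder, name), "w") as handle:
            handle.write(text)

    combined = load_rec_files(folder)

print(combined)
```

Joining the folder path onto each name also means the script does not have to run from inside the data folder.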

score 2 · Accepted Answer · answered Jul 13 '21 at 08:38

You mention dataframes, so I assume you are willing to use pandas. You can iterate over your csv files easily with the built-in pathlib module, then concatenate the frames:

from pathlib import Path
import pandas as pd

path_dir = Path(path_to_my_folder)

list_dfs = []
for path_file in path_dir.glob('REC_*.csv'):
    df_small = pd.read_csv(path_file)
    list_dfs.append(df_small)
    
df = pd.concat(list_dfs, axis=0)
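
Two things to watch for with the loop above: `glob` does not guarantee any particular file order, and `pd.concat` raises a `ValueError` if the list is empty (no file matched). A rough sketch that handles both, assuming that alphabetical order is what you want and that an empty DataFrame is an acceptable result when nothing matches; `concat_rec_csvs` is just a name chosen for this example.

```python
import tempfile
from pathlib import Path

import pandas as pd


def concat_rec_csvs(folder):
    """Concatenate all REC_*.csv files in `folder` (hypothetical helper)."""
    # Sort the matches so the row order is reproducible across runs
    paths = sorted(Path(folder).glob("REC_*.csv"))
    if not paths:
        # Nothing matched: return an empty frame rather than let pd.concat raise
        return pd.DataFrame()
    frames = [pd.read_csv(path) for path in paths]
    return pd.concat(frames, axis=0, ignore_index=True)


# Demo on throwaway folders
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "REC_b.csv").write_text("val_1,val_2\n3,6\n")
    Path(folder, "REC_a.csv").write_text("val_1,val_2\n1,2\n")
    Path(folder, "other_file.csv").write_text("val_1,val_2\n9,9\n")
    combined = concat_rec_csvs(folder)

with tempfile.TemporaryDirectory() as empty_folder:
    empty = concat_rec_csvs(empty_folder)

print(combined)
print(empty.empty)
```

If you would rather fail loudly when no file matches, drop the `if not paths` branch and let the `ValueError` propagate.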
