Dynamically create dataframes in pandas by reading a list of csv files

Question

I have a folder containing 3 csv files:

a.csv
b.csv
c.csv

To read all the csv files in this folder into dataframes, I'm currently doing this:

df1 = pd.read_csv('a.csv')
df2 = pd.read_csv('b.csv')
df3 = pd.read_csv('c.csv')

Is there any way to automate the naming of the dataframes (df1, df2 and df3) and the reading of all the csv files in that folder? Say I have 10 csv files; I don't want to manually write 10 read statements in pandas.

For example, I don't want to write this:

df1 = pd.read_csv('a.csv')
......
......
......

df10 = pd.read_csv('j.csv')

Thanks!

see [this](https://stackoverflow.com/questions/10377998/how-can-i-iterate-over-files-in-a-given-directory) question, and [this](https://stackoverflow.com/a/11801338/2204131). — ramesh, Jun 16 '17 at 21:48

bdiamante · Answer 1 · 2017-06-16T23:18:04.600

2

You can do this quite easily if you're willing to access a list of dataframes rather than have df1...dfn explicitly declared:

import os
import pandas as pd

root = "YOUR FOLDER"
csvs = []  # paths of the csv files found under root
dfs = []   # one dataframe per csv file

# collect csv paths (os.walk also descends into subfolders)
for dirpath, dirnames, filenames in os.walk(root):
    for file in filenames:
        if file.endswith('.csv'):  # skip anything that is not a csv
            csvs.append(os.path.join(dirpath, file))  # portable path separator

# store each dataframe in the list
for f in csvs:
    dfs.append(pd.read_csv(f))

Then access them as dfs[0] ... dfs[n-1], where n is the number of csv files found.
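The same idea can be wrapped in a small function. This is only a sketch, assuming the csv files sit directly in the folder (no subfolders are searched); the name `load_csv_list` is made up for illustration and is not part of pandas.

```python
from pathlib import Path

import pandas as pd


def load_csv_list(folder):
    """Read every .csv file directly inside `folder` into a list of dataframes.

    Hypothetical helper: the name and the non-recursive behaviour are assumptions.
    """
    # sorted() gives a stable order, so index 0 is always the first file by name
    paths = sorted(Path(folder).glob('*.csv'))
    return [pd.read_csv(p) for p in paths]
```

With the folder from the question, `dfs = load_csv_list("YOUR FOLDER")` would put a.csv at dfs[0], b.csv at dfs[1] and c.csv at dfs[2].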

edited Jun 16 '17 at 23:18

answered Jun 16 '17 at 22:30

bdiamante


In the OP's code, he knows that `df1` corresponds to the file named `'a.csv'`. If that's important, I suppose the OP could make `dfs` a dictionary and add them by doing `dfs[f] = pd.read_csv(f)`. – Steven Rumbalski Jun 16 '17 at 22:43

jrovegno · Answer 2 · 2018-04-30T21:54:47.870

1

You can create a dictionary of DataFrames:

import os
import pandas as pd
from glob import glob

dfs = {os.path.splitext(os.path.basename(f))[0]: pd.read_csv(f) for f in glob('*.csv')}
# the equivalent of df1 is dfs['a']
dfs['a']

edited Apr 30 '18 at 21:54

answered Jun 18 '17 at 00:44

jrovegno


score 0 · Answer 3 · answered Jun 16 '17 at 22:59

People may downvote this solution since I am asking you to play with global variables, but it does solve your problem.

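import os
import pandas as pd

folder = 'myDir'  # renamed from dir, which shadows a builtin
count = 0         # running counter, so subfolders do not restart at df1
for root, dirs, filenames in os.walk(folder):
    for f in filenames:
        if f.endswith('.csv'):
            count += 1
            globals()['df%d' % count] = pd.read_csv(os.path.join(root, f))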
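If the df1 ... dfn names matter but writing into globals() feels too risky, a plain dictionary keyed by those names gives much the same convenience. This is a rough sketch, not the approach from the answer above: `load_numbered` is a hypothetical helper, and it assumes the numbering should follow the sorted file names in a single folder.

```python
import os

import pandas as pd


def load_numbered(folder):
    """Map 'df1', 'df2', ... to dataframes read from the .csv files in `folder`.

    Hypothetical helper: numbering by sorted file name is an assumption.
    """
    names = sorted(f for f in os.listdir(folder) if f.endswith('.csv'))
    return {
        'df%d' % i: pd.read_csv(os.path.join(folder, name))
        for i, name in enumerate(names, start=1)
    }
```

For the three files in the question, `frames = load_numbered('myDir')` would make `frames['df1']` the contents of a.csv, without creating any module-level variables.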
