Creating pandas dataframes within a loop

Question

I am attempting to create approximately 120 data frames from a list of files and a list of dataframe names. The problem is that after the loop runs, the dataframes don't persist. My code is below. Does anyone know why this is not working?

for fname, dfname in zip(CSV_files, DF_names):
    filepath = find(fname, path)
    dfname = pd.DataFrame.from_csv(filepath)

I can't comment on why `dfname` isn't persisting, but can you try declaring a list outside the loop and appending the created df on each iteration? – EdChum, Jun 25 '15 at 15:48
If I add something like `print dfname` after the df has been created within the loop, the df is printed, but it just doesn't persist. I don't know why. – Jenks, Jun 25 '15 at 15:50

score 0 · Accepted Answer · edited May 23 '17 at 12:29

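This is how name binding works in Python: assigning to the loop variable only rebinds that name to a new object and leaves the sequence you are iterating over untouched. See this simpler example (comments show the output):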

values = [1, 2, 3]
for v in values:
    print(v, end=" ")
# 1 2 3
for v in values:
    v = 4  # rebinds the name v only; the list element is not touched
    print(v, end=" ")
# 4 4 4
print(values)
# [1, 2, 3]
# the values have not been modified
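
To make the contrast explicit, here is a minimal sketch (plain Python, no pandas; the variable names are illustrative only) that compares rebinding the loop variable with assigning through the container. Only the second form changes what the list holds.

```python
values = [1, 2, 3]

# Rebinding: `v` is pointed at a new object on every pass,
# but the list slots still refer to the original integers.
for v in values:
    v = 4
after_rebinding = list(values)

# Assigning through the container (by index) does change its contents.
for i in range(len(values)):
    values[i] = 4
after_index_assignment = list(values)

print(after_rebinding)         # [1, 2, 3]
print(after_index_assignment)  # [4, 4, 4]
```

The loop in the question is the first case: `dfname` starts out as a string taken from `DF_names`, and the assignment simply points that name at a DataFrame until the next iteration replaces it.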

Also look at this SO question and answer: Modifying a list iterator in Python not allowed?

The solution suggested in the comment should work better because it stores each DataFrame in a container instead of rebinding the loop variable. If you need a name to access each DataFrame, you can also use a dictionary:

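dfs = {}  # maps each name in DF_names to its DataFrame
for fname, dfname in zip(CSV_files, DF_names):
    filepath = find(fname, path)  # find() and path come from the question
    # DataFrame.from_csv was removed in pandas 1.0; this read_csv call
    # keeps its defaults (first column as index, dates parsed)
    df = pd.read_csv(filepath, index_col=0, parse_dates=True)
    dfs[dfname] = df
print(dfs[DF_names[1]])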
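
For anyone who wants to try the dictionary approach without the original files, here is a self-contained sketch. It assumes pandas is installed; the names `csv_sources`, `sales`, and `costs` are hypothetical stand-ins, and `io.StringIO` replaces the file paths that `find()` returns in the question.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for the CSV files in the question.
csv_sources = {
    "sales": "id,amount\n1,10\n2,20\n",
    "costs": "id,amount\n1,3\n2,4\n",
}

# Build one DataFrame per name, keyed by that name.
dfs = {
    name: pd.read_csv(io.StringIO(text), index_col=0)
    for name, text in csv_sources.items()
}

print(sorted(dfs))                    # ['costs', 'sales']
print(dfs["sales"].loc[2, "amount"])  # 20
```

Each value in `dfs` is its own DataFrame object, so the frames stay separate and can be queried or passed on one at a time by name.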


answered Jun 25 '15 at 15:57

stellasia


But how can I make all of the separate dataframes with the list? They all need to be separate for querying/Spark purposes – Jenks Jun 25 '15 at 16:00
All the dataframes are separate elements of the list. If you need more "human" names for them, you can use a dictionary (see my edited answer) – stellasia Jun 25 '15 at 16:04
