I am trying to create approximately 120 dataframes from a list of files and a matching list of dataframe names. The problem is that after the loop runs, the dataframes don't persist. My code is below. Does anyone know why this isn't working?

for fname, dfname in zip(CSV_files, DF_names):
    filepath = find(fname, path)
    dfname = pd.DataFrame.from_csv(filepath)
Jenks
  • I can't comment on why `dfname` isn't persisting but can you test just declaring a list outside the loop and appending the created df on each iteration – EdChum Jun 25 '15 at 15:48
  • If I add something like `print dfname` after the df has been created within the loop, the df is printed, but it just doesn't persist. I don't know why – Jenks Jun 25 '15 at 15:50
  • Have you tried what I suggested of appending to a list? – EdChum Jun 25 '15 at 15:53
  • Yes, when I do that the list contains all of the dfs – Jenks Jun 25 '15 at 15:55

1 Answer

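This is how Python works: assigning to the loop variable only rebinds that name. It does not change the list you are iterating over, and it does not create a variable named after the string. See this simpler example (comments show the output):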

values = [1, 2, 3]
for v in values:
    print v,
# 1 2 3
for v in values:
    v = 4
    print v,
# 4 4 4
print values
# [1, 2, 3]
# the values have not been modified
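
The same rebinding can be seen with names closer to the question. This is a minimal sketch in Python 3 syntax. The values in `DF_names` are made up, and a plain dict stands in for the dataframe, so it runs without pandas:

```python
# Illustrative only: DF_names holds hypothetical names, and a dict stands in
# for the dataframe that the question's loop would create.
DF_names = ["sales", "users"]
created = []

for dfname in DF_names:
    # Here dfname is bound to a string such as "sales".
    dfname = {"loaded_for": dfname}  # rebinds dfname to a new object
    created.append(dfname)

# The list of names still holds the original strings.
print(DF_names)               # ['sales', 'users']
# No variable called `sales` was ever created.
print("sales" in globals())   # False
# Only the last object is still reachable through the name `dfname`.
print(dfname is created[-1])  # True
```

Each object created in the loop survives only if something else (here the list `created`) keeps a reference to it.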

Also look at this SO question and answer: Modifying a list iterator in Python not allowed?

The solution suggested in the comments works because it stores each dataframe in a container instead of rebinding the loop variable. If you need a name to access each dataframe, you can also use a dictionary:

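# map each name in DF_names to the dataframe loaded from the matching file
dfs = {}
for fname, dfname in zip(CSV_files, DF_names):
    filepath = find(fname, path)
    df = pd.DataFrame.from_csv(filepath)
    dfs[dfname] = df  # dfname is only used as the key, never reassigned
# look up a dataframe by its name (here, the second name in the list)
print dfs[DF_names[1]]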
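
The dictionary approach can be tried end to end without pandas. The sketch below is in Python 3 syntax and rests on some assumptions: `load_all` and `read_rows` are hypothetical helpers, the two CSV files are generated on the spot, and the stand-in loader returns a list of row dicts rather than a DataFrame.

```python
import csv
import os
import tempfile


def load_all(file_paths, names, loader):
    """Return {name: loader(path)} so each loaded table stays reachable by name."""
    return {name: loader(path) for path, name in zip(file_paths, names)}


def read_rows(path):
    # Stand-in loader (assumption): returns a list of dict rows, not a DataFrame.
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Build two tiny CSV files in a temporary directory for the demonstration.
tmp_dir = tempfile.mkdtemp()
paths = []
for file_name, body in [("a.csv", "x,y\n1,2\n"), ("b.csv", "x,y\n3,4\n")]:
    full_path = os.path.join(tmp_dir, file_name)
    with open(full_path, "w", newline="") as handle:
        handle.write(body)
    paths.append(full_path)

tables = load_all(paths, ["first", "second"], read_rows)

# Every table is a separate object, reachable by its name.
for name, table in tables.items():
    print(name, table)
```

With pandas installed, passing `pd.read_csv` as `loader` should give one separate DataFrame per name in the same way.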
stellasia
  • But how can I make all of the separate dataframes with the list? They all need to be separate for querying/spark purposes – Jenks Jun 25 '15 at 16:00
  • All the dfs are different elements of the list. If you need more "human" names for them, you can use a dictionary (see my edited answer) – stellasia Jun 25 '15 at 16:04