I have a function that takes a DataFrame, modifies it, and returns the modified DataFrame.

I have a list dfs containing 5 DataFrames. I want to loop over them so that each one is modified by the function, something like this:

dfs = [df1, df2, df3, df4, df5]  # df1 to df5 : valid DataFrames

for df in dfs:
    df = function(df)

When I do that, the contents of the list dfs are not changed. I just end up with a variable named df that holds the modified version of df5 (the last DataFrame in the list).

What am I doing wrong? Is there a way I can achieve this?

Patrick Artner
Youssef Razak

1 Answer

You assign the modified DataFrame back to the name df, but that only rebinds the loop variable; it does not replace the item in the list that df came from. You need to store the result back into the list yourself:

dfs = [df1, df2, df3, df4, df5]

for idx, df in enumerate(dfs):
    dfs[idx] = function(df)       # immediately store result in list

This solves your problem.
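To make the rebinding point concrete, here is a minimal sketch that uses plain lists instead of DataFrames so it runs without pandas. The `function` below is a hypothetical stand-in that returns a new object. It also shows a list comprehension as an alternative to `enumerate`:

```python
# Hypothetical stand-in for the question's `function`; it returns a NEW object.
def function(values):
    return [v * 100 for v in values]

items = [[1], [2], [3]]

# Rebinding the loop variable does not touch the list.
for item in items:
    item = function(item)
unchanged = list(items)
print(unchanged)  # [[1], [2], [3]]

# Alternative to enumerate: build a new list and rebind the name `items`.
items = [function(item) for item in items]
print(items)  # [[100], [200], [300]]
```

If other names refer to the same list object and should see the change too, `items[:] = [function(item) for item in items]` replaces the contents in place instead of rebinding the name.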


Full demo:

import pandas as pd

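# Five one-row DataFrames holding the values 1 to 5
dfs = [pd.DataFrame({"t": [n]}) for n in range(1, 6)]

def function(df):
    df["t"] = df["t"] * 100  # modifies df in place and returns the same object
    return df

print(*dfs, "", sep="\n\n")  # state before the loop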

for idx, df in enumerate(dfs):
    dfs[idx] = function(df)

print(*dfs, sep="\n\n")

Output:

   t
0  1

   t
0  2

   t
0  3

   t
0  4

   t
0  5


     t
0  100

     t
0  200

     t
0  300

     t
0  400

     t
0  500
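One caveat about the demo: its `function` changes the DataFrame in place, so the list would show the new values even without the assignment. The assignment matters when the function returns a new DataFrame. The sketch below contrasts the two cases; `scale_copy` and `scale_in_place` are hypothetical names used only for illustration:

```python
import pandas as pd

def scale_in_place(df):
    df["t"] = df["t"] * 100      # mutates the DataFrame that was passed in
    return df

def scale_copy(df):
    return df.assign(t=df["t"] * 100)   # returns a new DataFrame

def values(frames):
    # Helper: first value of column "t" in each DataFrame
    return [int(f["t"].iloc[0]) for f in frames]

dfs = [pd.DataFrame({"t": [n]}) for n in range(1, 4)]

# Rebinding only: the new DataFrames are discarded, the list is unchanged.
for df in dfs:
    df = scale_copy(df)
after_rebinding = values(dfs)
print(after_rebinding)    # [1, 2, 3]

# Storing by index: the list now holds the new DataFrames.
for idx, df in enumerate(dfs):
    dfs[idx] = scale_copy(df)
after_storing = values(dfs)
print(after_storing)      # [100, 200, 300]

# In-place mutation: the list items change even without any assignment.
for df in dfs:
    scale_in_place(df)
after_mutation = values(dfs)
print(after_mutation)     # [10000, 20000, 30000]
```

Storing the result by index works in both cases, which is why it is the safer pattern when you do not know whether the function mutates its argument.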
Patrick Artner