I have a list of dataframes:

list_of_dfs = [df_1, df_2, df_3]

I can extract each dataframe from this list using:

print(list_of_dfs[0].head())

However, I am interested in the following:

  1. Extract these dataframes in a for loop and perform some calculations on each one
  2. Append the updated dataframes
  • > Append the updated dataframes. Append them to **what**? – Alaaaaa Sep 12 '21 at 22:26
  • Isn't this a simple for loop: `for df in list_of_dfs: do something`? – tdelaney Sep 12 '21 at 22:26
  • Generally iterating over rows/columns in an individual dataframe is considered a bad idea, since it's not particularly fast. Also, calculations generally should be done on individual series or dataframes themselves (although there are numerous exceptions). So, a clearer idea of what you want to do would help a lot. – Alex Huszagh Sep 12 '21 at 22:27
  • Thanks Alexander. My goal is to extract each dataframe from this list of dataframes, modify it by adding a few columns, and then concat all 3 dataframes into one single dataframe. Hope this helps, and thanks again for your help. – idg23 Sep 12 '21 at 22:35
  • A good general approach is to do it all in a single step using `pandas.merge`. If you need faster performance, a few solutions exist. You can always do a computation, add a series to the dataframe as a column, and then merge it with another dataframe. See https://stackoverflow.com/questions/44327999/python-pandas-merge-multiple-dataframes. – Alex Huszagh Sep 12 '21 at 22:40

1 Answer

One way is to loop through the list of dataframes and call the .apply() method on each one. Here some_function is a function you define elsewhere; with axis=1 it receives each row as a pandas Series and returns the value for the new column. If the calculation is really simple, a lambda works as well:

df["new_column"] = df.apply(lambda row: (your return here), axis=1)

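import pandas as pd

list_of_dfs = [df_1, df_2, df_3]
for df in list_of_dfs:
    # Adds the column in place; some_function receives each row as a Series.
    df["new_column"] = df.apply(some_function, axis=1)

# Stack the updated dataframes into a single dataframe.
dfs_combined = pd.concat(list_of_dfs)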
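
For a version that runs end to end, here is a minimal sketch of the same loop. The sample data, the column names a and b, and the body of some_function are made up for illustration; substitute your own dataframes and calculation. It also shows a vectorized form of the same calculation, which is usually faster than a row-wise .apply() when the logic can be written as column arithmetic.

```python
import pandas as pd

# Hypothetical sample data standing in for df_1, df_2 and df_3.
df_1 = pd.DataFrame({"a": [1, 2], "b": [10, 20]})
df_2 = pd.DataFrame({"a": [3, 4], "b": [30, 40]})
df_3 = pd.DataFrame({"a": [5, 6], "b": [50, 60]})
list_of_dfs = [df_1, df_2, df_3]


def some_function(row):
    """Hypothetical row-wise calculation; row is one row as a Series."""
    return row["a"] + row["b"]


updated_dfs = []
for df in list_of_dfs:
    df = df.copy()  # work on a copy so the original dataframes stay unchanged
    # Row-wise: some_function is called once per row.
    df["new_column"] = df.apply(some_function, axis=1)
    # Vectorized: the same result computed on whole columns at once.
    df["new_column_vectorized"] = df["a"] + df["b"]
    updated_dfs.append(df)

# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1 per dataframe.
dfs_combined = pd.concat(updated_dfs, ignore_index=True)
print(dfs_combined)
```

Collecting the updated copies in a new list is a choice made here, not a requirement: assigning the column directly on each dataframe in list_of_dfs, as in the answer above, modifies the originals in place and works too.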