I have several dataframes with indices in common (a list of countries) which I am iterating over to perform some manipulations. I have read this answer and know that iterating over dataframes isn't ideal - I have vectorised as much as I can, but the manipulations are somewhat complex, involving comparing rows of different dataframes and transforming them separately with custom algorithms, so some iteration seems unavoidable.
The basic workflow is:
for i in index:
    row1 = df1.loc[i]
    row2 = df2.loc[i]
    row3 = df3.loc[i]
    (row1, row2, row3).do.some.comparisons
    row1 = row1.apply.some.transformations
    row2 = row2.apply.some.algorithms
    row3 = row3.some.other.algorithms
I would like to get the dataframes back at the end with the new values correctly assigned to each row.
If I end the for block with:
    df1.loc[i] = row1
    df2.loc[i] = row2
    df3.loc[i] = row3
then I get a SettingWithCopyWarning. Looking into this, it seems my code has exactly the structure that the Pandas documentation warns against here (Yikes!).
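For reference, here is a minimal, self-contained sketch of the loop. The country names, the values, and the comparison and transformations are all placeholders made up for illustration; they are not my real data or algorithms. In this toy version I take an explicit copy of each row with .copy() before changing it and write it back with .loc, which appears to give the expected values, but I am not sure whether this is the recommended pattern or whether it holds up on my real data.

```python
import pandas as pd

# Toy data: made-up countries and values, for illustration only.
index = ["France", "Japan", "Peru"]
df1 = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]}, index=index)
df2 = pd.DataFrame({"a": [10.0, 20.0, 30.0], "b": [40.0, 50.0, 60.0]}, index=index)
df3 = pd.DataFrame({"a": [0.5, 1.5, 2.5], "b": [3.5, 4.5, 5.5]}, index=index)

for i in index:
    # Explicit copies, so each row should be an independent Series
    # rather than a possible view into its parent dataframe.
    row1 = df1.loc[i].copy()
    row2 = df2.loc[i].copy()
    row3 = df3.loc[i].copy()

    # Placeholder comparison across the three rows (hypothetical logic).
    scale = 2.0 if row1.sum() < row2.sum() else 1.0

    # Placeholder transformations standing in for the custom algorithms.
    row1 = row1 * scale
    row2 = row2 - row3
    row3 = row3 + row1

    # Write the new values back to the matching row of each dataframe.
    df1.loc[i] = row1
    df2.loc[i] = row2
    df3.loc[i] = row3

print(df1)
print(df2)
print(df3)
```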
What's the best way to get around this problem? How do I reliably get my dataframes back with the new values in them?