I am trying to update a set of pandas dataframes with the results of some calculations, which are themselves stored in a dataframe. I wrote the following loop to do this. It seems to work inside the loop, but the original dataframes are not updated once the loop has finished!
Can you please tell me where I am going wrong? I am using Python 3.7.1 and pandas 1.0.5 on a Windows 10 machine.
z_score_list = ['LVESV_i', 'LVEDV_i', 'LVSV_i', 'LV_mass_i', 'RVEDV_i', 'RVESV_i', 'RVSV_i'] # columns used for calculation
df_list = [t1df, t1vsd_df, t1highshunt_df, t1preTVcases_df] #list of dfs to update
print('Before loop shape: ', t1df.shape)
for i, df in enumerate(df_list):
    print('before update =', df.shape)
    df_z = df[z_score_list]
    df_z = calc_Z_scores(df_z,merge=False) # function returns calculated Z-scores in a dataframe
    df = df.merge(df_z, on = df.index, how='inner') # here I merge them
    df.drop(columns = 'key_0', inplace=True) # drop the additional index
    # df.head()
    print('after update = ', df.shape)
    del(df_z)
    # df = df.copy(deep=True) - tried this, but does not work
print('After loop shape: ', t1df.shape)
Here is the output:
Before loop shape: (63, 55)
before = (63, 55)
after = (63, 62)
before = (8, 55)
after = (8, 62)
before = (30, 54)
after = (30, 61)
before = (55, 55)
after = (55, 62)
After loop shape: (63, 55)
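
The output above is consistent with how name binding works in Python: `df = df.merge(...)` builds a new dataframe and rebinds the loop variable `df` to it, so the objects held in `df_list` (and the name `t1df`) are left untouched. Below is a minimal sketch of that behaviour and of one possible workaround, which stores each result back into a container. It makes several assumptions: `calc_z_scores_stub` is a hypothetical stand-in for `calc_Z_scores` (which is not shown here), the toy frames and the `frames` dict are invented for illustration, and `DataFrame.join` on the index is used in place of the `merge`/`drop('key_0')` pair.

```python
import pandas as pd


def calc_z_scores_stub(df_z):
    """Hypothetical stand-in for calc_Z_scores: column-wise z-scores, suffixed '_z'."""
    z = (df_z - df_z.mean()) / df_z.std()
    return z.add_suffix("_z")


# Toy frames standing in for t1df, t1vsd_df, ... (invented data).
frames = {
    "t1df": pd.DataFrame({"LVESV_i": [1.0, 2.0, 3.0], "LVEDV_i": [4.0, 6.0, 8.0]}),
    "t1vsd_df": pd.DataFrame({"LVESV_i": [2.0, 4.0], "LVEDV_i": [1.0, 3.0]}),
}
z_cols = ["LVESV_i", "LVEDV_i"]

# Rebinding the loop variable only changes what the local name `df` refers to;
# the dataframe stored in the dict is not modified.
for df in frames.values():
    df = df.join(calc_z_scores_stub(df[z_cols]))
shape_after_rebinding = frames["t1df"].shape

# Storing the result back under its key keeps the updated dataframe.
for name in list(frames):
    df = frames[name]
    frames[name] = df.join(calc_z_scores_stub(df[z_cols]))
shape_after_storing = frames["t1df"].shape

print("after rebinding:", shape_after_rebinding)
print("after storing back:", shape_after_storing)
```

With a list instead of a dict, the equivalent would be `df_list[i] = df` at the end of each iteration. Either way, the original variable names such as `t1df` would still refer to the old dataframes, so the updated ones would need to be read from the container or reassigned explicitly.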