I'm having an issue with pandas DataFrames. It seems that pandas/Python creates a copy of the DataFrame somewhere in my code instead of applying the modifications to the original.
In the code below, update_df still sees the DataFrame with the "file_exists" column, which the previous function (clean_df2) should have removed.
MAIN:
if __name__ == '__main__':
    df_main = load_df()
    clean_df2(df_main)
    update_df(df_main, image_path_main)
    .....
clean_df2:
def clean_df2(df):  # remove non-existing files from DF
    df['file_exists'] = True  # add column, set all to True?
    .....
    df = df[df['file_exists'] != False]  # keep only records that exist
    df.drop('file_exists', 1, inplace=True)  # delete the temporary column
    df.reset_index(drop=True, inplace=True)  # reindex if source has gaps
update_df:
def update_df(df, image_path):  # add DF rows for files not yet in DF
    print(df)
    ....
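
A minimal sketch of what is probably going on, assuming the elided part of clean_df2 only sets values in the "file_exists" column. The column assignment modifies the caller's DataFrame in place, but the filtering line `df = df[...]` rebinds the local name `df` to a new DataFrame, so every later step acts on that new object and the caller never sees it. The function names, the sample file names, and the hard-coded True/False values below are hypothetical and only serve the illustration.

```python
import pandas as pd


def clean_rebinds(df):
    # Hypothetical stand-in for clean_df2, using the same rebinding pattern.
    df['file_exists'] = [True, False, True]  # in place: the caller's object gains the column
    df = df[df['file_exists'] != False]      # rebinding: df now names a NEW DataFrame
    df = df.drop(columns='file_exists')      # affects only the new object
    df = df.reset_index(drop=True)
    # Nothing is returned, so the cleaned frame is discarded here.


def clean_returns(df):
    # Variant that builds the cleaned frame and hands it back to the caller.
    df = df.assign(file_exists=[True, False, True])  # assign() returns a new frame
    df = df[df['file_exists']]                       # keep only rows flagged True
    df = df.drop(columns='file_exists')
    return df.reset_index(drop=True)


original = pd.DataFrame({'file': ['a.jpg', 'b.jpg', 'c.jpg']})
clean_rebinds(original)
print(list(original.columns))  # the temporary column is still on the caller's frame
print(len(original))           # and no rows were removed

fresh = pd.DataFrame({'file': ['a.jpg', 'b.jpg', 'c.jpg']})
cleaned = clean_returns(fresh)
print(cleaned)
```

Under that assumption, one way out is to return the cleaned frame and reassign it in the caller, e.g. `df_main = clean_df2(df_main)`, rather than relying on in-place changes after the name has been rebound.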