I have weekly snapshots of the same dataset, so later weeks contain rows that earlier weeks don't. I want to combine all the weeks into one big dataframe that contains every unique row, with no duplicates. I can't just take the last week, because some rows get deleted over the weeks.
I tried the following code, but somehow my final_info dataframe still contains duplicate rows:
final_info = data[list(data.keys())[-1]]['all_info']
for week in reversed(data.keys()):
    df_diff = pd.concat([data[week]['all_info'], final_info]).drop_duplicates(subset='project_slug',
                                                                              keep=False)
    final_info = final_info.append(df_diff).reset_index(drop=True)
Does somebody see where it goes wrong?
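A likely cause: with keep=False, drop_duplicates discards every row whose project_slug occurs more than once in the concatenated frame, so df_diff is the symmetric difference of the two frames. It therefore also contains the rows that are in final_info but not in the current week, and those get appended to final_info a second time. A minimal sketch of one alternative is to stack all weeks once and deduplicate at the end. The toy data below is hypothetical; it assumes the dict keys are in chronological order and that project_slug identifies a row.

```python
import pandas as pd

# Hypothetical toy data mirroring the structure in the question:
# a dict keyed by week, each entry holding a DataFrame under 'all_info'.
data = {
    "week1": {"all_info": pd.DataFrame({"project_slug": ["a", "b"], "value": [1, 2]})},
    "week2": {"all_info": pd.DataFrame({"project_slug": ["b", "c"], "value": [20, 3]})},
    "week3": {"all_info": pd.DataFrame({"project_slug": ["c", "d"], "value": [30, 4]})},
}

# Stack every week in key order (assumed chronological), then keep exactly
# one row per project_slug. keep="last" retains the most recent version of
# each row; keep="first" would retain the earliest one instead.
final_info = (
    pd.concat([data[week]["all_info"] for week in data], ignore_index=True)
    .drop_duplicates(subset="project_slug", keep="last")
    .reset_index(drop=True)
)

print(final_info)
```

Rows that were deleted in later weeks (here "a") survive because every week contributes to the concatenation. This also avoids DataFrame.append, which was removed in pandas 2.0.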