I am trying to compare two pandas dataframes containing old and new values. Some of the ids have been deleted in the new data and some have been added, so the indexes are not identical. I am able to concatenate the dataframes using the ids as indexes, columns as the axis, and 'first' and 'second' as the keys, but cannot find a way to preserve the order of both dataframes. The order of data1 is preserved but I would like the order of data2 to also be preserved.
What I tried:
data1={'id':[1, 2, 3, 4, 5],'value':[10, 25, 12, 100, 26]}
data2={'id':[1, 2, 6, 4, 5],'value':[10, 24, 48, 100, 60]}
df = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df_both = pd.concat([df.set_index('id'), df2.set_index('id')],
axis='columns', keys=['First', 'Second'])
print(df_both)
Resulting df_both:
First Second
value value
id
1 10.0 10.0
2 25.0 24.0
3 12.0 NaN
4 100.0 100.0
5 26.0 60.0
6 NaN 48.0
id 6 was placed at the bottom of the dataframe because the index did not exist in data1.
What I want as the resulting df_both:
First Second
value value
id
1 10.0 10.0
2 25.0 24.0
3 12.0 NaN
6 NaN 48.0
4 100.0 100.0
5 26.0 60.0
I would like for the deleted and new rows to maintain their position.