Is there a way to compare two DataFrames that have multiple columns and differ in length (1386 vs. 1383 rows in the example below), and output only the rows that differ?
Example data:
> df_left
index location date
0 0 Adelaide 2019-01-01
1 1 Adelaide 2019-02-01
2 2 Adelaide 2019-03-01
3 3 Adelaide 2019-04-01
4 4 Adelaide 2019-05-01
... ... ... ...
1381 1381 Western London 2019-03-01
1382 1382 Western London 2019-04-01
1383 1383 Western London 2019-05-01
1384 1384 Western London 2019-06-01
1385 1385 Western London 2019-07-01
[1386 rows x 2 columns]
> df_right
location date
0 Adelaid 2019-01-01
1 Adelaide 2019-02-01
2 Adelaide 2019-03-01
3 Adelaide 2019-04-01
4 Adelaide 2019-05-01
... ... ...
1378 Western London 2019-03-01
1379 Western London 2019-04-01
1380 Western London 2019-05-01
1381 Western London 2019-06-01
1382 Western London 2019-07-01
[1383 rows x 2 columns]
I tried this, but it only places the two frames side by side by index and does not show the differences:
pd.concat([df_left_with_index_columns,df_right_with_index_columns], ignore_index=True, axis=1, join="outer")
0 1 2 3
0 Adelaide 2019-01-01 Adelaide 2019-01-01
1 Adelaide 2019-02-01 Adelaide 2019-02-01
2 Adelaide 2019-03-01 Adelaide 2019-03-01
3 Adelaide 2019-04-01 Adelaide 2019-04-01
4 Adelaide 2019-05-01 Adelaide 2019-05-01
... ... ... ... ...
1381 Western London 2019-03-01 Western London 2019-06-01
1382 Western London 2019-04-01 Western London 2019-07-01
1383 Western London 2019-05-01 NaN NaT
1384 Western London 2019-06-01 NaN NaT
1385 Western London 2019-07-01 NaN NaT
[1386 rows x 4 columns]
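For reference, one approach that might fit is an outer merge on all shared columns with `indicator=True`, keeping only the rows that are not present in both frames. This is a minimal sketch, not a definitive solution. The small frames below are hypothetical stand-ins for `df_left` and `df_right`, and it assumes the comparison should be on the `location` and `date` values rather than on row position:

```python
import pandas as pd

# Hypothetical miniature versions of df_left and df_right (not the real data)
df_left = pd.DataFrame({
    "location": ["Adelaide", "Adelaide", "Western London", "Western London"],
    "date": pd.to_datetime(["2019-01-01", "2019-02-01", "2019-06-01", "2019-07-01"]),
})
df_right = pd.DataFrame({
    "location": ["Adelaid", "Adelaide", "Western London"],
    "date": pd.to_datetime(["2019-01-01", "2019-02-01", "2019-06-01"]),
})

# Outer merge on the shared columns; indicator=True adds a "_merge" column
# whose value is "left_only", "right_only", or "both" for each row
merged = df_left.merge(df_right, on=["location", "date"], how="outer", indicator=True)

# Keep only the rows that appear in exactly one of the two frames
diff = merged[merged["_merge"] != "both"].reset_index(drop=True)
print(diff)
```

Because the match is on values and not on the index, the differing lengths should not matter. One caveat: if either frame contains duplicate (`location`, `date`) pairs, the merge multiplies the matching rows, so the duplicates may need to be dropped or counted first.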