I'd like to combine two dataframes using their similar column 'A':
>>> df1
A B
0 I 1
1 I 2
2 II 3
>>> df2
A C
0 I 4
1 II 5
2 III 6
To do so I tried using:
merged = pd.merge(df1, df2, on='A', how='outer')
Which returned:
>>> merged
A B C
0 I 1.0 4
1 I 2.0 4
2 II 3.0 5
3 III NaN 6
However, since df2 only contained one value for A == 'I', I do not want this value to be duplicated in the merged dataframe. Instead I would like the following output:
>>> merged
A B C
0 I 1.0 4
1 I 2.0 NaN
2 II 3.0 5
3 III NaN 6
What is the best way to do this? I am new to python and still slightly confused with all the join/merge/concatenate/append operations.