I have two subsets with similar columns, but the one column they have in common is column A. I have the left df, L, and the right df, R. I want to make sure that any values in column A of L that also appear in column A of R are removed from L (the whole row, not just the value). How would one do this?
import pandas as pd

L_df = pd.DataFrame({'A': ['bob/is/cool', 'alice/is/cool', 'jim/is/cool'],
                     'view': ['A', 'B', 'B']})
R_df = pd.DataFrame({'A': ['ralf/is/cool', 'i/am/cool', 'alice/is/cool'],
                     'view': ['A', 'B', 'C']})
I want to combine the two with duplicates on column A removed, dropping the duplicated row from L, not from R. So we keep alice/is/cool with a view value of C and not B, if that makes sense :)
The expected output would be:

out = pd.DataFrame({'A': ['ralf/is/cool', 'i/am/cool', 'alice/is/cool', 'bob/is/cool', 'jim/is/cool'],
                    'view': ['A', 'B', 'C', 'A', 'B']})
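
For reference, here is one way this might be approached. It is a minimal sketch, not the only option. It assumes matching on column A alone is enough and that the R rows should come first, as in the expected output. The names L_only, result and alt are made up for illustration.

```python
import pandas as pd

L_df = pd.DataFrame({'A': ['bob/is/cool', 'alice/is/cool', 'jim/is/cool'],
                     'view': ['A', 'B', 'B']})
R_df = pd.DataFrame({'A': ['ralf/is/cool', 'i/am/cool', 'alice/is/cool'],
                     'view': ['A', 'B', 'C']})

# Keep only the rows of L_df whose 'A' value does not appear in R_df
# (L_only is a hypothetical helper name).
L_only = L_df[~L_df['A'].isin(R_df['A'])]

# Stack R_df first, then the surviving L_df rows, and renumber the index.
result = pd.concat([R_df, L_only], ignore_index=True)

# Alternative sketch: concatenate with R_df first and keep the first
# occurrence of each 'A' value, so the R_df row wins on a clash.
# Note this would also collapse duplicates inside R_df or L_df themselves.
alt = (pd.concat([R_df, L_df])
         .drop_duplicates(subset='A', keep='first')
         .reset_index(drop=True))

print(result)
```

The isin version only removes rows from L and leaves R untouched, which seems closest to what is described above. The drop_duplicates version is shorter but treats every repeated A value as a duplicate, including repeats within a single frame.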