Suppose I have two DataFrames:
import pandas as pd

df1 = pd.DataFrame({
    'a': [0, 0, 0, 1, 1, 1, 1],
    'b': [0, 0, 1, 1, 1, 1, 1],
})
df2 = pd.DataFrame({
    'a': [0, 0, 0, 1, 1],
    'b': [0, 0, 0, 1, 1],
})
I want to compare these two DataFrames and find all the extra rows in df1 that are not in df2, counting duplicates: a row that appears four times in df1 and only twice in df2 should appear twice in the result.
The desired output is:
a | b |
---|---|
0 | 1 |
1 | 1 |
1 | 1 |
I have tried merge, but it produces extra rows because each duplicate in df1 matches every duplicate in df2, and I don't want to drop the duplicates.
Is there a good way of approaching this?
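
One possible direction, offered as a sketch rather than a definitive answer: number each repeated row within its group using groupby(...).cumcount(), then do a left merge on the original columns plus that counter with indicator=True. Each duplicate in df1 can then pair with at most one duplicate in df2, and the rows tagged left_only are the extras. The helper column name occ and the variable names below are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'a': [0, 0, 0, 1, 1, 1, 1],
    'b': [0, 0, 1, 1, 1, 1, 1],
})
df2 = pd.DataFrame({
    'a': [0, 0, 0, 1, 1],
    'b': [0, 0, 0, 1, 1],
})

cols = ['a', 'b']

# 'occ' is a hypothetical helper column: the 0-based occurrence number of
# each row among identical rows, so the n-th copy of a row in df1 can only
# match the n-th copy of the same row in df2.
left = df1.assign(occ=df1.groupby(cols).cumcount())
right = df2.assign(occ=df2.groupby(cols).cumcount())

# Left merge on the columns plus the occurrence number; the indicator column
# '_merge' marks rows of df1 that found no partner in df2 as 'left_only'.
merged = left.merge(right, on=cols + ['occ'], how='left', indicator=True)

extra = merged.loc[merged['_merge'] == 'left_only', cols].reset_index(drop=True)
print(extra)
```

With the sample data above, this should leave the three rows from the desired output: one (0, 1) and two (1, 1). It assumes both frames share the same column names and that comparing on all of cols is what is wanted.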