I have two data frames.

first_dataframe

id
9
8
6
5
7
4

second_dataframe

id
6
4
1
5
2
3

Note: My dataframes have many columns, but I need to compare them based on the id column only. I need to find:

  1. ids that are in the first dataframe and not in the second [7,8,9]
  2. ids that are in the second dataframe and not in the first [1,2,3]

I have searched for an answer, but none of the solutions I've found seem to work for me, because they look for differences based on the index.

1 Answer

Use set subtraction:

inDF1_notinDF2 = set(df1['id']) - set(df2['id']) # Removes all items that are in df2 from df1
inDF2_notinDF1 = set(df2['id']) - set(df1['id']) # Removes all items that are in df1 from df2

Output:

>>> inDF1_notinDF2
{7, 8, 9} 

>>> inDF2_notinDF1
{1, 2, 3}
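
Since the question mentions that the dataframes have many columns, it may be more useful to keep the full rows rather than just a set of ids. Here is a minimal sketch of that using a boolean mask built with `Series.isin`. The `name` column and its values are made up for illustration, and `df1`/`df2` stand in for first_dataframe and second_dataframe:

```python
import pandas as pd

# Sample ids from the question; "name" is a hypothetical extra column
df1 = pd.DataFrame({"id": [9, 8, 6, 5, 7, 4], "name": list("abcdef")})
df2 = pd.DataFrame({"id": [6, 4, 1, 5, 2, 3], "name": list("uvwxyz")})

# Rows of df1 whose id does not appear anywhere in df2
only_in_df1 = df1[~df1["id"].isin(df2["id"])]

# Rows of df2 whose id does not appear anywhere in df1
only_in_df2 = df2[~df2["id"].isin(df1["id"])]

print(sorted(only_in_df1["id"].tolist()))  # [7, 8, 9]
print(sorted(only_in_df2["id"].tolist()))  # [1, 2, 3]
```

The comparison uses the values in the id column only, so the index of either dataframe does not matter, and the other columns come along with each selected row.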