I am relatively new to Python. Say I have the following two DataFrames, df1 (left) and df2 (right):

Id Name Job                Name Salary Location
1  Jim  Tester             Jim  100    Japan
2  Bob  Developer          Bob  200    US
3  Sam  Support            Si   300    UK
                           Sue  400    France

I want to compare the 'Name' column in df2 to the one in df1, such that if a person's name in df2 does not exist in df1, that row of df2 is output to another DataFrame. For the example above, the output would be:

       Name Salary Location
       Si   300    UK
       Sue  400    France  

Si and Sue are output because they do not exist in the 'Name' column of df1.

jpp
    Possible duplicate of [Selecting Unique Rows between Two DataFrames in Pandas](https://stackoverflow.com/questions/23460345/selecting-unique-rows-between-two-dataframes-in-pandas) – user2653663 Sep 06 '18 at 17:33

1 Answer

You can use Boolean indexing:

res = df2[~df2['Name'].isin(df1['Name'].unique())]

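pd.Series.unique is hash-based, so calling it first deduplicates the names in df1 cheaply. This is only an optimization for the case where df1 contains duplicate names; the result is the same without it.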
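As a minimal, self-contained sketch, the frames below are rebuilt from the tables in the question (the construction code is an assumption, not the asker's actual data loading). The second variant, a left merge with `indicator=True`, is an alternative not mentioned in the answer and is shown only for comparison.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed construction).
df1 = pd.DataFrame({
    "Id": [1, 2, 3],
    "Name": ["Jim", "Bob", "Sam"],
    "Job": ["Tester", "Developer", "Support"],
})
df2 = pd.DataFrame({
    "Name": ["Jim", "Bob", "Si", "Sue"],
    "Salary": [100, 200, 300, 400],
    "Location": ["Japan", "US", "UK", "France"],
})

# Boolean indexing: True where the name in df2 also appears in df1.
in_df1 = df2["Name"].isin(df1["Name"].unique())
res = df2[~in_df1]  # keep only the rows whose name is NOT in df1
print(res)

# Alternative (not from the answer): anti-join via merge with an indicator column.
merged = df2.merge(df1[["Name"]].drop_duplicates(), on="Name",
                   how="left", indicator=True)
res_merge = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
```

Boolean indexing keeps the original index labels of df2; append `.reset_index(drop=True)` if a fresh 0-based index is wanted.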

jpp