2

I have two dataframes:

df0 = pd.DataFrame({'a':[9,8,7,6]},index=[0,1,2,4])
df1 = pd.DataFrame({'b':[5,6,4,7]},index=[2,4,6,8])

I am aiming to get the rows of df0 which have indices not appearing in df1. I perform a left-excluxing join (more info on joins and image source):

enter image description here

I do:

res = \
df0.merge(
    df1,how='left',left_index=True, right_index=True, indicator=True
).query('_merge=="left_only"').drop(['b','_merge'],axis=1)

Which returns the desired dataframe but seems an overkill. I thought about using filter as well:

df0.filter(items=df1.index,axis=0)

but this returns the rows from df0 which I want to drop, not the ones I want to keep.

Is there an easier way to do this than the left-excluding join shown above?

zabop
  • 6,750
  • 3
  • 39
  • 84

2 Answers2

5

Use Index.isin with inverted mask by ~ in boolean indexing:

res = df0[~df0.index.isin(df1.index)]
print (res)
   a
0  9
1  8
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
1

You can also do it by dropping the rows in df0 that contain an index in df1:

df0.drop(df0[(df0.index.isin(df1.index))].index, inplace=True)
Jason
  • 75
  • 4