0

I have a df1 that looks like this:

ParentID   Name
12         kids
29         jdö

and a second df2 that looks like this:

ParentID   Location
34         56789
12         5608

I want to check whether all parentIDs from df1 are present in the df2 or not. If not, I want to extract them in a new df for example:

ParentID   Name
29         jdö

I believe a join could be used but I'm not sure how

result = left.join(right, on=["key1", "key2"], how="inner")
x89
  • 2,798
  • 5
  • 46
  • 110
  • [`pd.merge`](https://stackoverflow.com/questions/53645882/pandas-merging-101) has the [`indicator`](https://pandas.pydata.org/docs/reference/api/pandas.merge.html) parameter for that. – Michael Szczesny Sep 13 '21 at 14:33
  • You can use `NOT IN` like `df1[~df1['ParentID'].isin(df2['ParentID'])]` or check the indicator if wanting merge as indicated by the other comments and answers. [merge indicator answer](https://stackoverflow.com/a/47107164/15497888) or [isin answer](https://stackoverflow.com/a/44318806/15497888) – Henry Ecker Sep 13 '21 at 14:38

1 Answers1

1

Use indicator= parameter and then filter the first dataframe:

x = df1[df1.merge(df2, how="left", indicator=True)._merge.eq("left_only")]
print(x)

Prints:

   ParentID Name
1        29  jdö
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
  • I don't think it's accurate. Because it prints certain parentIDs and when I check them in the second dataset manually, they are present there – x89 Sep 15 '21 at 09:19