-1

I have 2 dataframes:

df1 has between 100-300 rows depending on the day
df2 contains between 10,000-40,000 depending on the day

The columns are as such:

df1:

ID      STATE     STATUS
1         NY      ACCEPTED
1         PA      ACCEPTED
1         CA      ACCEPTED
2         NY      ACCEPTED
3         NY      ACCEPTED

df2:

ID       COUNTRY    STATUS
1          US       ACCEPTED
2          US       
3          US       ACCEPTED
4          US       
5          US       ACCEPTED

I need to be able to take each entry from df1 and determine which entries from df1 have an accepted STATUS in df2. All the entries in df1 have been accepted, so the only check I need is whether they also were accepted in df2.

What I am not figuring out is:

How do I locate the same ID, then check the STATUS of that row and return true or false for each?

The bonus is that after that I still need to then extract all the ID's from df2 that are not accepted so that I can use them, so I cannot destroy df2.

jkdev
  • 11,360
  • 15
  • 54
  • 77
Randor
  • 19
  • 3
  • 2
    There are likely other solutions, but [`pandas.DataFrame.merge`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html) is a fairly intuitive one. Note the `how` and `indicator` arguments. – Brendan Jul 14 '19 at 23:37

1 Answers1

0

You could merge both dataframes and check the status with pd.merge:

pd.merge(left=df_a, right=df_b, on='id',suffixes=('_df_a','_df_b'))

enter image description here