0

I have two data frames with same column but different values, out of which some are same and some are different. I want to compare both columns and keep the common values.

df1 :

  A B C
  1 1 1
  2 4 6
  3 7 9
  4 9 0
  6 0 1

df2 :

  A D E
  1 5 7
  5 6 9
  2 3 5
  7 6 8
  3 7 0

This is what I am expecting after comparison

df2 :

  A D E
  1 5 7
  2 3 5
  3 7 0
Arkistarvh Kltzuonstev
  • 6,824
  • 7
  • 26
  • 56
Bhavishya
  • 9
  • 1
  • 3

1 Answers1

1

You can use pd.Index.intersection() to find the matching columns and do a inner merge finally reindex() to keep df2.columns:

match=df2.columns.intersection(df1.columns).tolist() #finds matching cols in both df
df2.merge(df1,on=match).reindex(df2.columns,axis=1) #merge and reindex to df2.columns

   A  D  E
0  1  5  7
1  2  3  5
2  3  7  0
anky
  • 74,114
  • 11
  • 41
  • 70
  • Using this, I am getting the merged values and not the common ones. – Bhavishya Jul 26 '19 at 12:20
  • @Bhavishya what is the difference? can you elaborate, this gets exactly what you posted as desired answer – anky Jul 26 '19 at 12:24
  • I have around 250k entries in df1 and around 1.1m entries in df2, so after keeping just the common values I should get less than or equal to 250k entries in df2 but I'm getting around 1m entries instead. – Bhavishya Jul 26 '19 at 12:31