-2

I have two dataframes with same columns. Only one column has different values. I want to concatenate the two without duplication.

df2 = pd.DataFrame({'key': ['K0', 'K1', 'K2'],'cat': ['C0', 'C1', 'C2'],'B': ['B0', 'B1', 'B2']})
df1 = pd.DataFrame({'key': ['K0', 'K1', 'K2'],'cat': ['C0', 'C1', 'C2'],'B': ['A0', 'A1', 'A2']})

df1
Out[630]: 
  key cat   B
0  K0  C0  A0
1  K1  C1  A1
2  K2  C2  A2
df2
Out[631]: 
  key cat   B
0  K0  C0  B0
1  K1  C1  B1
2  K2  C2  B2

I tried:

result = pd.concat([df1, df2], axis=1)
result
Out[633]: 
  key cat   B key cat   B
0  K0  C0  A0  K0  C0  B0
1  K1  C1  A1  K1  C1  B1
2  K2  C2  A2  K2  C2  B2

The desired output:

  key cat   B_df1  B_df2
0  K0  C0    A0     B0
1  K1  C1    A1     B1
2  K2  C2    A2     B2

NOTE: I could drop duplicates afterwards and rename columns but that doesn't seem efficient

mobelahcen
  • 414
  • 5
  • 22

1 Answers1

5

pd.merge will do the job

pd.merge(df1,df2, on=['key','cat'])

Output

  key cat   B_x   B_y
0  K0  C0    A0    B0
1  K1  C1    A1    B1
2  K2  C2    A2    B2
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
iamklaus
  • 3,720
  • 2
  • 12
  • 21