0

I have two dataframes: df1 and df2. df1 has columns A,B,C and df2 has columns B,C,D. For the matching vaues in columns B,C I want to merge both dataframes.

df1 = pd.DataFrame({'A': [2, 3, 4], 'B': [5, 6, 7], 'C': [8, 9, 0]})
df2 = pd.DataFrame({'B': [6, 7, 5], 'C': [9, 0, 8], 'D': [1, 6, 2]})

For example, from the above two dataframes df1 and df2 I want to get df3:

df3 = pd.DataFrame({'A': [2, 3, 4], 'B': [5, 6, 7], 'C': [8, 9, 0], 'D': [2, 1, 6]})

I cannot simply merge by index, as the values for B, C in the index do not match.

How do I do this?

Jeroen
  • 801
  • 6
  • 20

2 Answers2

1

Use sets operation:

>>> df1.merge(df2, on=df1.columns.intersection(df2.columns).tolist())
   A  B  C  D
0  2  5  8  2
1  3  6  9  1
2  4  7  0  6

According the documentation:

on: (label or list)
Column or index level names to join on. These must be found in both DataFrames. If on is None and not merging on indexes then this defaults to the intersection of the columns in both DataFrames.

So df1.merge(df2) works too.

Corralien
  • 109,409
  • 8
  • 28
  • 52
0

Try the following:

import pandas as pd

df1 = pd.DataFrame({'A': [2, 3, 4], 'B': [5, 6, 7], 'C': [8, 9, 0]})
df2 = pd.DataFrame({'B': [6, 7, 5], 'C': [9, 0, 8], 'D': [1, 6, 2]})

df3 = df1.merge(df2, on=['B', 'C'])
Amira Bedhiafi
  • 8,088
  • 6
  • 24
  • 60