2

I have two data frames: df1 as follows:

  col0 col1  col1.1  col3
0    a    d       1     6
1    b    e       5     7

And df2 as follows:

  colx  coly
0    a    10
1    b    20
2    d    50
3    e    40

How do I combine the dataframes in-place such that the final df looks like this?

  col0 col1  col1.1  col3  colx coly
0    a    d       1     6   10   50
1    b    e       5     7   20   40

As far as I understand, the merge method in pandas merges both dataframes based on a given axis. In this case, I want to merge them based on the value in another dataframe. What is the function I am looking for in this case?

lmo
  • 37,904
  • 9
  • 56
  • 69
tandem
  • 2,040
  • 4
  • 25
  • 52

5 Answers5

4

Using merge chained twice:

mrg = df1.merge(df2, left_on='col0', right_on='colx')\
         .merge(df2, left_on='col1', right_on='colx')\
         .drop(columns=['colx_x', 'colx_y'])

Output

  col0 col1  col1.1  col3  coly_x  coly_y
0    a    d       1     6      10      50
1    b    e       5     7      20      40
Erfan
  • 40,971
  • 8
  • 66
  • 78
  • I get the following error: `ValueError: You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat` – tandem Aug 26 '19 at 14:20
  • Are you merging on the correct columns, so the arguments `left_on` and `right_on`. Seems like you are selecting an object/string column and a numeric/float column – Erfan Aug 26 '19 at 14:22
  • 1
    I made a mistake. my bad. apologies – tandem Aug 26 '19 at 14:22
3

You could use map:

mapper = df2.set_index('colx')['coly']
df1['colx'] = df1['col0'].map(mapper)
df1['coly'] = df1['col1'].map(mapper)

print(df1)

Output:

  col0 col1  col1.1  col3  colx  coly
0    a    d       1     6    10    50
1    b    e       5     7    20    40
Scott Boston
  • 147,308
  • 15
  • 139
  • 187
2

Using applymap and get:

s=df2.set_index('colx')['coly']
df1[['colx','coly']]=df1.iloc[:,:2].applymap(lambda x: s.get(x))
print(df1)

  col0 col1  col1.1  col3  colx  coly
0    a    d       1     6    10    50
1    b    e       5     7    20    40
anky
  • 74,114
  • 11
  • 41
  • 70
1

Another solution with replace:

dct = df2.set_index('colx')['coly'].to_dict()
df1['coly_x'] = df1['col0'].replace(dct)
df1['coly_y'] = df1['col1'].replace(dct)

Output:

  col0 col1  col1.1  col3 coly_x  coly_y
0    a    d       1     6     10      50
1    b    e       5     7     20      40
Quang Hoang
  • 146,074
  • 10
  • 56
  • 74
0

Here is a method to join on the indexes:

df1.join([df2.iloc[0:2,1].rename('colx'),
          df2.iloc[2:,1].rename('coly').reset_index(drop=True)])

Elements of the desired column are extracted, subset, and renamed. The index of the second series is reset to match. Then these are joined to df1.

this returns

  col0 col1  col1.1  col3  colx  coly
0    a    d       1     6    10    50
1    b    e       5     7    20    40
lmo
  • 37,904
  • 9
  • 56
  • 69