1

So let's say I have df_1

       Day         Month       Amt              
 --------------- --------- ---------  
  Monday            Jan      10              
  Tuesday           Feb      20 
  Wednesday         Feb      30
  Thursday          April    40
  Friday            April    50 

and df_2

       Month        Amt              
 --------------- ---------
      Jan          999
      Feb          1000000
      April        123456

I want to get the following result when I do a left merge:

       Day         Month       Amt              
 --------------- --------- ---------  
  Monday            Jan      999              
  Tuesday           Feb      1000000 
  Wednesday         Feb      1000000
  Thursday          April    123456
  Friday            April    123456

So basically the 'Amt' values from the right table replace the 'Amt' values from the left table where applicable.

When I try

df_1.merge(df_2,how = 'left',on = 'Month')

I get:

       Day         Month       Amt_X   Amt_Y              
 --------------- --------- ---------  -------
  Monday            Jan      10        999       
  Tuesday           Feb      20        1000000
  Wednesday         Feb      30        1000000
  Thursday          April    40        123456
  Friday            April    50        123456

Anyone know of a simple and efficient fix? Thanks!

1 Answers1

3

This answer is purely supplemental to the duplicate target. That is a much more comprehensive answer than this.

Strategy #1

there are two components to this problem.

Use df_2 to create a mapping.

The intuitive way to do this is

mapping = df_2.set_index('Month')['Amt']

which creates a series object that can be passed to pd.Series.map

However, I'm partial to

mapping = dict(zip(df_2.Month, df_2.Amt))

Or even more obtuse

mapping = dict(zip(*map(df_2.get, df_2)))

Use pandas.Series.map

df_1.Month.map(mapping)

0        999
1    1000000
2    1000000
3     123456
4     123456
Name: Month, dtype: int64

Finally, you want to put that into the existing dataframe.

Create a copy

df_1.assign(Amt=df_1.Month.map(mapping))

         Day  Month      Amt
0     Monday    Jan      999
1    Tuesday    Feb  1000000
2  Wednesday    Feb  1000000
3   Thursday  April   123456
4     Friday  April   123456

Overwrite existing data

df_1['Amt'] = df_1.Month.map(mapping)

Strategy #2

To use merge most succinctly, drop the column that is to be replaced.

df_1.drop('Amt', axis=1).merge(df_2)

         Day  Month      Amt
0     Monday    Jan      999
1    Tuesday    Feb  1000000
2  Wednesday    Feb  1000000
3   Thursday  April   123456
4     Friday  April   123456
piRSquared
  • 285,575
  • 57
  • 475
  • 624