
Data

import pandas as pd

pb = {"mark_up_id":{"0":"123","1":"456","2":"789","3":"111","4":"222"},"mark_up":{"0":1.2987,"1":1.5625,"2":1.3698,"3":1.3333,"4":1.4589}}
data = {"id":{"0":"K69","1":"K70","2":"K71","3":"K72","4":"K73","5":"K74","6":"K75","7":"K79","8":"K86","9":"K100"},"cost":{"0":29.74,"1":9.42,"2":9.42,"3":9.42,"4":9.48,"5":9.48,"6":24.36,"7":5.16,"8":9.8,"9":3.28},"mark_up_id":{"0":"123","1":"456","2":"789","3":"111","4":"222","5":"333","6":"444","7":"555","8":"666","9":"777"}}
pb = pd.DataFrame(data=pb).set_index('mark_up_id')
df = pd.DataFrame(data=data)

Expected Output

test = df.join(pb, on='mark_up_id', how='left')
test['cost'].update(test['cost'] + test['mark_up'])
test.drop('mark_up',axis=1,inplace=True)

Or:

df['cost'].update(df['mark_up_id'].map(pb['mark_up']) + df['cost'])

Question

Is there a function that does the above, or is this the best way to go about this type of operation?

Yale Newman

1 Answer


I would use the second solution you propose, or better, this:

df['cost']=(df['mark_up_id'].map(pb['mark_up']) + df['cost']).fillna(df['cost'])

Using update can be awkward because it modifies the Series in place and returns None, so it cannot be chained.

Series.fillna is more flexible because it returns a new Series.
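As a minimal sketch of that behaviour, the snippet below calls Series.update on a standalone Series. The two-row values are illustrative, taken from the first and sixth rows of the question's data. Calling update on a column selection such as df['cost'] may or may not write back to df, depending on the pandas version and its copy-on-write settings, so the sketch avoids that form.

```python
import numpy as np
import pandas as pd

# Illustrative values: the second entry has no mark-up, hence NaN.
cost = pd.Series([29.74, 9.48])
marked_up = pd.Series([31.0387, np.nan])

# update works in place and skips NaN values in its argument.
ret = cost.update(marked_up)

print(ret)            # update has no return value to chain on
print(cost.tolist())  # the first entry is replaced, the second is kept
```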
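To see why fillna is needed at all, here is a small sketch on a three-row subset of the question's data. The id "333" has no entry in pb, so map returns NaN for it and the sum would lose that row's cost. The intermediate names looked_up, summed and new_cost are only for illustration.

```python
import pandas as pd

# Subset of the question's data: id "333" is missing from pb.
pb = pd.DataFrame(
    {"mark_up_id": ["123", "456"], "mark_up": [1.2987, 1.5625]}
).set_index("mark_up_id")
df = pd.DataFrame(
    {
        "id": ["K69", "K70", "K74"],
        "cost": [29.74, 9.42, 9.48],
        "mark_up_id": ["123", "456", "333"],
    }
)

# map looks up each mark_up_id in pb['mark_up']; unknown ids become NaN.
looked_up = df["mark_up_id"].map(pb["mark_up"])

# NaN + cost is NaN, so the unmatched row would lose its cost here ...
summed = looked_up + df["cost"]

# ... unless fillna restores the original cost for those rows.
new_cost = summed.fillna(df["cost"])
print(new_cost)
```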

We can also use DataFrame.assign, which returns a new DataFrame, so we can keep chaining operations on the result.

df.assign( cost=(df['mark_up_id'].map(pb['mark_up']) + df['cost']).fillna(df['cost']) )
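A short sketch of what chaining on the result of assign could look like, using a two-row subset of the data and a Series version of pb for brevity. Column names are case sensitive: the keyword `cost` overwrites the existing column, while a differently cased name would add a new column instead. The trailing round call is only there to show a chained step and is not part of the original answer.

```python
import pandas as pd

# Assumed simplification: pb as a Series indexed by mark_up_id.
pb = pd.Series({"123": 1.2987, "456": 1.5625}, name="mark_up")
df = pd.DataFrame({"cost": [29.74, 9.48], "mark_up_id": ["123", "333"]})

result = (
    df.assign(
        # The lambda receives the DataFrame being built, so the chain
        # does not depend on the outer variable name.
        cost=lambda d: (d["mark_up_id"].map(pb) + d["cost"]).fillna(d["cost"])
    )
    .round({"cost": 2})  # illustrative follow-up step
)

print(result)
print(df)  # the original DataFrame is left unchanged
```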

Time comparison with the join method

%%timeit
df['cost']=(df['mark_up_id'].map(pb['mark_up']) + df['cost']).fillna(df['cost'])
#945 µs ± 46 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit
test = df.join(pb, on='mark_up_id', how='left')
test['cost'].update(test['cost'] + test['mark_up'])
test.drop('mark_up',axis=1,inplace=True)
#3.59 ms ± 137 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The join approach is the slowest of the three.


%%timeit
df['cost'].update(df['mark_up_id'].map(pb['mark_up']) + df['cost'])
#985 µs ± 32.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
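%%timeit is an IPython magic, so the cells above only run in IPython or Jupyter. Outside a notebook, a rough equivalent can be sketched with the standard-library timeit module. The helper names with_map and with_join are hypothetical, the data is a small subset, and the absolute numbers will differ from machine to machine.

```python
import timeit

import pandas as pd

pb = pd.DataFrame(
    {"mark_up_id": ["123", "456"], "mark_up": [1.2987, 1.5625]}
).set_index("mark_up_id")
df = pd.DataFrame(
    {"cost": [29.74, 9.42, 9.48], "mark_up_id": ["123", "456", "333"]}
)


def with_map(frame):
    # Look the mark-up up per row, add it, keep the old cost where unmatched.
    return (frame["mark_up_id"].map(pb["mark_up"]) + frame["cost"]).fillna(frame["cost"])


def with_join(frame):
    # Same result via a left join on the mark_up_id column.
    joined = frame.join(pb, on="mark_up_id", how="left")
    return (joined["cost"] + joined["mark_up"]).fillna(joined["cost"])


n = 200  # number of repetitions, chosen arbitrarily
t_map = timeit.timeit(lambda: with_map(df), number=n) / n
t_join = timeit.timeit(lambda: with_join(df), number=n) / n
print(f"map:  {t_map * 1e6:.0f} µs per call")
print(f"join: {t_join * 1e6:.0f} µs per call")
```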

Finally, I recommend reading: Understanding inplace and When should I use apply

ansev
  • Thanks! I guess I'm more wondering whether there is a clever use of functions like applymap etc. that leverage lambda, or whether using join/map is the way to go – Yale Newman Dec 26 '19 at 19:40
  • I think `join` is not necessary here. On the other hand, `DataFrame.apply` is generally not recommended; use it only when strictly necessary. I will update the answer with a link to the SO topic on apply – ansev Dec 26 '19 at 19:57
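To illustrate the point made in the comments, here is a hedged sketch comparing a row-wise apply with a lambda against the vectorized map version on a small subset of the data. Both give the same result. The apply version calls a Python function once per row, which is the usual reason it is discouraged for this kind of lookup. The names via_apply and via_map are only illustrative.

```python
import pandas as pd

# Assumed simplification: pb as a Series indexed by mark_up_id.
pb = pd.Series({"123": 1.2987, "456": 1.5625}, name="mark_up")
df = pd.DataFrame(
    {"cost": [29.74, 9.42, 9.48], "mark_up_id": ["123", "456", "333"]}
)

# Row-wise: one Python-level function call per row.
via_apply = df.apply(
    lambda row: row["cost"] + pb.get(row["mark_up_id"], 0.0), axis=1
)

# Vectorized: one lookup and one addition over whole columns.
via_map = df["cost"] + df["mark_up_id"].map(pb).fillna(0.0)

print(via_apply.tolist() == via_map.tolist())
```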