You are really close, but a vectorized solution without apply is much better: get a boolean mask and convert it to int:
mask = (df_transactions['plan_list_price'] == 0) & \
       (df_transactions['actual_amount_paid'] > 0)
df_transactions['emome'] = mask.astype(int)
If you really want the slower apply:
f = lambda x: 1 if x['plan_list_price'] ==0 and x['actual_amount_paid'] > 0 else 0
df_transactions['emome'] = df_transactions.apply(f, axis=1)
Sample:
import pandas as pd

df_transactions = pd.DataFrame({'A':list('abcdef'),
                                'plan_list_price':[0,0,0,5,5,0],
                                'actual_amount_paid':[-1,0,9,4,2,3]})
mask = (df_transactions['plan_list_price'] == 0) & \
(df_transactions['actual_amount_paid'] > 0)
df_transactions['emome1'] = mask.astype(int)
f = lambda x: 1 if x['plan_list_price'] ==0 and x['actual_amount_paid'] > 0 else 0
df_transactions['emome2'] = df_transactions.apply(f, axis=1)
print (df_transactions)
   A  actual_amount_paid  plan_list_price  emome1  emome2
0  a                  -1                0       0       0
1  b                   0                0       0       0
2  c                   9                0       1       1
3  d                   4                5       0       0
4  e                   2                5       0       0
5  f                   3                0       1       1
Timings:
#[60000 rows]
df_transactions = pd.concat([df_transactions] * 10000, ignore_index=True)
In [201]: %timeit df_transactions['emome1'] = ((df_transactions['plan_list_price'] == 0) & (df_transactions['actual_amount_paid'] > 0)).astype(int)
1000 loops, best of 3: 971 µs per loop
In [202]: %timeit df_transactions['emome2'] = df_transactions.apply(lambda x: 1 if x['plan_list_price'] ==0 and x['actual_amount_paid'] > 0 else 0, axis=1)
1 loop, best of 3: 1.15 s per loop
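The timings above use IPython's %timeit. If you want to repeat the comparison in a plain Python script, a rough sketch with the standard timeit module could look like the following. The helper names flag_vectorized and flag_apply are made up for this illustration, and the numbers you get will depend on your machine and pandas version:

```python
import timeit

import pandas as pd

# Same toy data as in the sample, repeated to 60000 rows.
df = pd.DataFrame({'A': list('abcdef'),
                   'plan_list_price': [0, 0, 0, 5, 5, 0],
                   'actual_amount_paid': [-1, 0, 9, 4, 2, 3]})
df = pd.concat([df] * 10000, ignore_index=True)


def flag_vectorized(df):
    """Hypothetical helper: boolean mask over whole columns, converted to 0/1."""
    mask = (df['plan_list_price'] == 0) & (df['actual_amount_paid'] > 0)
    return mask.astype(int)


def flag_apply(df):
    """Hypothetical helper: the same condition evaluated row by row with apply."""
    return df.apply(
        lambda x: 1 if x['plan_list_price'] == 0 and x['actual_amount_paid'] > 0 else 0,
        axis=1)


# Both approaches should give identical flags.
assert (flag_vectorized(df) == flag_apply(df)).all()

# Time a single run of each approach.
t_vec = timeit.timeit(lambda: flag_vectorized(df), number=1)
t_apply = timeit.timeit(lambda: flag_apply(df), number=1)
print('vectorized: {:.4f} s, apply: {:.4f} s'.format(t_vec, t_apply))
```

A single run per approach is noisy, so treat the printed values only as an order-of-magnitude check.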