Pandas apply ValueError: The truth value of a Series is ambiguous

Question

I'm trying to create a new feature using

df_transactions['emome'] = df_transactions['emome'].apply(lambda x: 1 if df_transactions['plan_list_price'] ==0 & df_transactions['actual_amount_paid'] > 0 else 0).astype(int)

But it raises an error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I create a new column that returns 1 when plan_list_price is 0 and actual_amount_paid is >0 else 0?

I would like to still use pandas apply.

I've run into this problem a few times before, and I want to learn the proper way to use pandas apply. — Chia Yi, Jan 26 '18 at 11:21
The proper way of using apply... is to not use it at all ;) Also, the reason is because you used & when you should have used `and`. Don't use them interchangeably. `&` is logical AND _only_ in the context of dataframes. — cs95, Jan 26 '18 at 11:22
The problem is not apply per se. It is your misconception on how to use multiple logical conditions, for which there is a duplicate. — cs95, Jan 26 '18 at 11:23
It's not a 1:1 duplicate... but here it is: https://stackoverflow.com/questions/22591174/pandas-multiple-conditions-while-indexing-data-frame-unexpected-behavior — cs95, Jan 26 '18 at 11:25
Note that in the statement `df_transactions['plan_list_price'] ==0 & df_transactions['actual_amount_paid'] > 0`, `&` binds more tightly than `==` and `>`, so Python evaluates it as the chained comparison `df_transactions['plan_list_price'] == (0 & df_transactions['actual_amount_paid']) > 0`. The implicit `and` in that chain asks for the truth value of a Series, which is what gives you the error. — Aditya Santoso, Apr 11 '19 at 02:21
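The precedence behaviour discussed in the comments can be checked directly. The following is a minimal sketch, assuming a hypothetical two-row frame `df` that is not part of the question; it shows the unparenthesized expression raising the ValueError and the parenthesized one producing an element-wise mask.

```python
import pandas as pd

# Hypothetical two-row frame, used only to illustrate operator precedence.
df = pd.DataFrame({'plan_list_price': [0, 5], 'actual_amount_paid': [9, 4]})
price = df['plan_list_price']
paid = df['actual_amount_paid']

# Without parentheses, `0 & paid` is evaluated first, and what remains,
# `price == (0 & paid) > 0`, is a chained comparison joined by an implicit `and`.
# That `and` needs the truth value of a Series, which pandas refuses to guess.
try:
    price == 0 & paid > 0
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# With parentheses, each comparison yields a boolean Series,
# and `&` combines the two element-wise.
mask = (price == 0) & (paid > 0)
print(mask.tolist())  # [True, False]
```

Wrapping each comparison in parentheses is what makes `&` safe to use on whole columns.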

jezrael · Answer 1 · 2018-01-26T11:26:21.747

You are really close, but a vectorized solution without apply is much better. Build a boolean mask and convert it to int:

mask = (df_transactions['plan_list_price'] == 0) & \
       (df_transactions['actual_amount_paid'] > 0)
df_transactions['emome'] = mask.astype(int)

If you really want the slower apply:

f = lambda x: 1 if x['plan_list_price'] ==0 and x['actual_amount_paid'] > 0 else 0
df_transactions['emome'] = df_transactions.apply(f, axis=1)

Sample:

import pandas as pd

df_transactions = pd.DataFrame({'A':list('abcdef'),
                                'plan_list_price':[0,0,0,5,5,0],
                                'actual_amount_paid':[-1,0,9,4,2,3]})


mask = (df_transactions['plan_list_price'] == 0) & \
       (df_transactions['actual_amount_paid'] > 0)
df_transactions['emome1'] = mask.astype(int)

f = lambda x: 1 if x['plan_list_price'] ==0 and x['actual_amount_paid'] > 0 else 0
df_transactions['emome2'] = df_transactions.apply(f, axis=1)
print (df_transactions)

   A  actual_amount_paid  plan_list_price  emome1  emome2
0  a                  -1                0       0       0
1  b                   0                0       0       0
2  c                   9                0       1       1
3  d                   4                5       0       0
4  e                   2                5       0       0
5  f                   3                0       1       1

Timings:

#[60000 rows]
df_transactions = pd.concat([df_transactions] * 10000, ignore_index=True)

In [201]: %timeit df_transactions['emome1'] = ((df_transactions['plan_list_price'] == 0) & (df_transactions['actual_amount_paid'] > 0)).astype(int)
1000 loops, best of 3: 971 µs per loop

In [202]: %timeit df_transactions['emome2'] = df_transactions.apply(lambda x: 1 if x['plan_list_price'] ==0 and x['actual_amount_paid'] > 0 else 0, axis=1)
1 loop, best of 3: 1.15 s per loop

I would like to use df_transactions['emome'] = df_transactions['emome'].apply(xxx), how can i fill in the xxx part? — Chia Yi, Jan 26 '18 at 11:20

score 0 · Answer 2 · answered Apr 10 '19 at 20:49

A few issues:

On the right side of the assignment, the new field (emome) has not been created yet.
The lambda function operates on x, not on df_transactions; referring to df_transactions inside it compares whole columns instead of the current row.
You need to specify axis, since you are applying to each row (the default is to each column).

From Doc:

axis : {0 or ‘index’, 1 or ‘columns’}, default 0
Axis along which the function is applied:

0 or ‘index’: apply function to each column.
1 or ‘columns’: apply function to each row.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html
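To make the axis distinction concrete, here is a small sketch, assuming a hypothetical two-row frame `df` that reuses the question's column names. With the default axis the function receives whole columns; with axis=1 it receives one row at a time, so scalar lookups and plain `and` work.

```python
import pandas as pd

# Hypothetical frame for illustration only; column order is fixed explicitly.
df = pd.DataFrame(
    {'plan_list_price': [0, 5], 'actual_amount_paid': [9, 4]},
    columns=['plan_list_price', 'actual_amount_paid'],
)

# axis=0 (the default): the function is called once per column,
# and each call receives that column as a Series.
per_column = df.apply(lambda col: col.sum())
print(per_column.tolist())  # [5, 13]

# axis=1: the function is called once per row, and each call receives
# that row as a Series, so row['...'] is a scalar and `and` is valid.
per_row = df.apply(
    lambda row: 1 if row['plan_list_price'] == 0 and row['actual_amount_paid'] > 0 else 0,
    axis=1,
)
print(per_row.tolist())  # [1, 0]
```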
