Could someone show me what I am doing wrong?

import pandas as pd
import operator

def calculate(A, B):
   if (A > 2 and B == True):
       Z = A * 10
   else:
       Z = A * 10000
   return Z

df = pd.DataFrame()
df['A'] = 1,2,3,4,5
df['B'] = True,True,False,False,True
df['C'] = calculate(df.A, df.B)

df

**Error:** `ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().`

Thanks a lot! I couldn't find a solution to my problem on Stack Overflow. I am a total beginner and started coding today. The solutions provided in other questions didn't help me; sorry for the duplicate post.

Sociopath
Tyrus Rechs
  • I am guessing you want an element-wise logical AND, in which case here is your answer: https://stackoverflow.com/questions/21415661/logic-operator-for-boolean-indexing-in-pandas TL;DR: try changing `and` to `&`. – Stev Feb 27 '18 at 11:35
  • Thanks, but it didn't work. – Tyrus Rechs Feb 27 '18 at 11:38
  • Don't forget to put some brackets: `if (A > 2) & (B == True):`. Otherwise the order of operations is wrong and you still get the error. – Jeronimo Feb 27 '18 at 12:11

2 Answers

I think you need to chain the conditions with `&` (element-wise AND) and use `numpy.where`:

import numpy as np
import pandas as pd

df = pd.DataFrame()
df['A'] = 1,2,3,4,5
df['B'] = True,True,False,False,True

# Where the mask is True use A * 10, otherwise A * 10000
df['C'] = np.where((df.A > 2) & df.B, df.A * 10, df.A * 10000)
print(df)
   A      B      C
0  1   True  10000
1  2   True  20000
2  3  False  30000
3  4  False  40000
4  5   True     50

Detail (the combined boolean mask):

print((df.A > 2) & df.B)
0    False
1    False
2    False
3    False
4     True
dtype: bool
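To illustrate why `and` raises while `&` works, here is a minimal sketch. The names `a`, `b`, `raised` and `mask` are illustrative only and not from the question. Python's `and` asks its left operand for a single truth value, which pandas refuses to produce for a multi-element Series, whereas `&` is applied element by element.

```python
import pandas as pd

a = pd.Series([1, 2, 3, 4, 5])
b = pd.Series([True, True, False, False, True])

# `and` needs one True/False from (a > 2); pandas raises instead of guessing.
try:
    (a > 2) and b
    raised = False
except ValueError:
    raised = True

# `&` combines the two boolean Series element-wise.
mask = (a > 2) & b

print(raised)         # True
print(mask.tolist())  # [False, False, False, False, True]
```

The parentheses around `a > 2` matter because `&` binds more tightly than `>` in Python, so `a > 2 & b` is parsed as `a > (2 & b)`.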

If you really need a row-by-row solution (slow, not recommended), apply the original function to each row:

def calculate(A, B):
   if (A > 2 and B == True):
       Z = A * 10
   else:
       Z = A * 10000
   return Z

df['C'] = df.apply(lambda x: calculate(x.A, x.B), axis=1)
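If you would rather keep a function with the same `(A, B)` signature, one possible sketch is to move the `np.where` call inside it so that it accepts whole columns. The name `calculate_vectorized` is hypothetical and not part of the original code.

```python
import numpy as np
import pandas as pd

def calculate_vectorized(A, B):
    # Hypothetical helper: same rule as calculate(), but applied to whole
    # columns at once instead of one row at a time.
    return np.where((A > 2) & B, A * 10, A * 10000)

df = pd.DataFrame({"A": [1, 2, 3, 4, 5],
                   "B": [True, True, False, False, True]})
df["C"] = calculate_vectorized(df["A"], df["B"])

print(df["C"].tolist())  # [10000, 20000, 30000, 40000, 50]
```

This keeps the call site close to the original `calculate(df.A, df.B)` while avoiding the Python-level loop that `apply(..., axis=1)` performs.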
jezrael

Here is an alternative solution to your problem. It avoids an explicit function (and a computationally expensive row-by-row call) by using boolean indexing with `.loc`:

import pandas as pd

df = pd.DataFrame()
df['A'] = 1,2,3,4,5
df['B'] = True,True,False,False,True

df['C'] = df['A'] * 10000
df.loc[(df['A'] > 2) & df['B'], 'C'] /= 1000

#    A      B        C
# 0  1   True  10000.0
# 1  2   True  20000.0
# 2  3  False  30000.0
# 3  4  False  40000.0
# 4  5   True     50.0
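In the output shown above, column `C` comes out as float because of the division. If integer values are preferred, a possible variant (a sketch, with `mask` as an illustrative name) is to assign the multiplied values directly to the selected rows:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5],
                   "B": [True, True, False, False, True]})

# Rows where A > 2 and B is True
mask = (df["A"] > 2) & df["B"]

# Default case first, then overwrite only the rows selected by the mask.
df["C"] = df["A"] * 10000
df.loc[mask, "C"] = df.loc[mask, "A"] * 10

print(df["C"].tolist())  # [10000, 20000, 30000, 40000, 50]
```

Because only integers are written into `C`, the column keeps an integer dtype.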
jpp