I have been using pandas to analyze a dataset, running some lengthy operations through functions I have defined (for convenience, and because I use the same functions in code that does not involve pandas). I am trying to perform operations that depend on which of two numbers is larger, using if and else statements.
I have not been able to find a workaround in other answers. Here is a short, simplified example of the kind of logic I am trying to implement:
import pandas as pd
df = pd.DataFrame({"A": [177,166,155,125,146,149,192,160,111,85],
                   "B": [26.2,27,26.8,23.4,23.3,17.5,26.4,25.7,18.9,15.8],
                   "C": [9.2,99.1,29.3,8.6,8,7.2,10,39.4,47.25,4.5]})
x = 'A'
y = 'B'
z = 'C'
def test(a,b,c):
    h = a*b/c
    return h
df['D'] = test(df[x],df[y],df[z])
Functions written like this have worked for me so far:
print(df['D'])
0    504.065217
1     45.227043
2    141.774744
3    340.116279
4    425.225000
5    362.152778
6    506.880000
7    104.365482
8     44.400000
9    298.444444
Name: D, dtype: float64
What I would like to get working is an operation like this:
def test2(a,b,c):
    if a > b:
        return a*c
    else:
        return b*c
df['E'] = test2(df[x],df[y],df[z])
print(df['E'])
I am getting the obvious error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().