I want to create a new column in a dataframe based on if/then logic. The rules in the actual problem come from a CART tree, so they are fairly complex. When I try to apply the function to my dataframe, I get this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I am pretty sure this happens because the 'if' statement is evaluating the input as a whole Series rather than row by row, but I can't figure out how to fix it.
To replicate:
import pandas as pd
import numpy as np
np.random.seed(1)
#create sample dataframe
df_test = pd.DataFrame({"llflag": np.random.normal(0,1,100)})
#sample if/else logic
def tree1(df):
    if df['llflag'] <= 0.5:
        return 4
    else:
        return 3
#attempt to apply function to df
df_test['testRR'] = df_test.apply(tree1(df_test ), axis = 1)
I got the same result with:
df_test['testRR'] = df_test.apply(lambda x: tree1(df_test), axis=1)
What am I missing? Thanks in advance.
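For reference, here is a minimal sketch of the row-wise behavior I am after. It assumes the cause is what I suspect above: both attempts call tree1 on the whole dataframe, so the comparison produces a boolean Series that 'if' cannot reduce to a single True/False. The parameter name `row` and the columns `testRR_vec` and the variable `ambiguous_error` are only illustrative, and the np.where line is a vectorized alternative for this simple rule, not necessarily the right shape for a full CART tree.

```python
import numpy as np
import pandas as pd

np.random.seed(1)
df_test = pd.DataFrame({"llflag": np.random.normal(0, 1, 100)})

# Comparing a whole column yields a boolean Series; forcing it to a
# single bool (which is what `if` does) raises the ValueError.
try:
    bool(df_test["llflag"] <= 0.5)
    ambiguous_error = None
except ValueError as exc:
    ambiguous_error = exc

def tree1(row):
    # With axis=1, apply passes one row at a time, so row["llflag"]
    # is a scalar and the comparison is a plain bool.
    if row["llflag"] <= 0.5:
        return 4
    return 3

# Pass the function itself (not the result of calling it).
df_test["testRR"] = df_test.apply(tree1, axis=1)

# Vectorized alternative for this simple two-way rule (illustrative).
df_test["testRR_vec"] = np.where(df_test["llflag"] <= 0.5, 4, 3)
```

The key difference from my attempts is that `tree1` is handed to apply uncalled, so pandas supplies each row in turn. In the lambda version the equivalent would be passing `x`, the row, rather than `df_test`.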