I have a pandas DataFrame and a set of custom functions that run data checks on survey data. There are a number of exceptions where certain checks should or should not be run, depending on a categorical variable or a date variable. When I do something like this:
    def data_check(df):
        if df[string_col] == 'some string':
            df = package.f1(df, other_col1)
        df = package.f2(df, other_col1, other_col2)
        if df[date_col] > some_datetime_obj:
            df = package.f3(df, other_col3)
        return df

    clean_df = data_check(dirty_df)
I get this error:
    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
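Here is a minimal sketch that reproduces the problem. The column names (`category`, `survey_date`) and the data are made-up placeholders, not my real survey columns. It shows that comparing a column to a scalar gives one boolean per row, and that putting that result straight into an `if` raises the error:

```python
import pandas as pd

# Hypothetical stand-in data; column names and values are placeholders.
dirty_df = pd.DataFrame({
    "category": ["some string", "other", "some string"],
    "survey_date": pd.to_datetime(["2020-01-01", "2021-06-15", "2022-03-01"]),
})

# Comparing a column to a scalar yields a boolean Series (one value per row),
# not a single True/False.
mask = dirty_df["category"] == "some string"
print(mask.tolist())

# Using that Series directly as an `if` condition raises ValueError.
try:
    if mask:
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

So the condition is evaluated for the whole column at once, while I was expecting it to behave like a per-row test.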
Thanks!!