I have a dataframe with three columns: a_id, b and c (a_id is a unique key). I would like to assign a score to each row based on the values in columns b and c. I have created the following functions:
def b_score_function(df):
    if df['b'] <= 0:
        return 0
    elif df['b'] <= 2:
        return 0.25
    else:
        return 1
def c_score_function(df):
    if df['c'] <= 0:
        return 0
    elif df['c'] <= 1:
        return 0.5
    else:
        return 1
Normally, I would use something like this:
df['b_score'] = df.apply(b_score_function, axis=1)
df['c_score'] = df.apply(c_score_function, axis=1)
However, this approach becomes too long when there are many columns. How can I loop over the selected columns instead? I have tried the following:
ds_cols = df.columns.difference(['a_id']).to_list()
for col in ds_cols:
    df[f'{col}_score'] = df.apply(f'{col}_score_function', axis = 1)
but it raised the following error:
'b_score_function' is not a valid function for 'DataFrame' object
Can anyone please point out what I did wrong? Also, if anyone can suggest how to make this reusable, that would be appreciated.
Thank you.
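
For reference, the error seems to come from passing a string to df.apply: pandas treats a string argument as the name of a built-in DataFrame method (such as 'sum'), so 'b_score_function' is not found. One possible fix is to pass the function objects themselves, for example through a dict keyed by column name. The sketch below is only one way to do it and makes a few assumptions: the scoring functions are rewritten to take a single value instead of a row, the helper add_scores and the dict score_functions are hypothetical names, and the sample data is made up.

```python
import pandas as pd


def b_score_function(value):
    """Score a single value from column b (same thresholds as in the question)."""
    if value <= 0:
        return 0
    elif value <= 2:
        return 0.25
    return 1


def c_score_function(value):
    """Score a single value from column c (same thresholds as in the question)."""
    if value <= 0:
        return 0
    elif value <= 1:
        return 0.5
    return 1


# Map each column name to its scoring function (the function object, not its name).
score_functions = {"b": b_score_function, "c": c_score_function}


def add_scores(df, score_functions):
    """Hypothetical helper: return a copy of df with one '<col>_score' column per dict entry."""
    out = df.copy()
    for col, func in score_functions.items():
        # Series.apply calls func once per value in the column.
        out[f"{col}_score"] = out[col].apply(func)
    return out


# Made-up sample data, for illustration only.
df = pd.DataFrame({"a_id": [1, 2, 3], "b": [0, 2, 5], "c": [-1, 1, 3]})
scored = add_scores(df, score_functions)
print(scored)
```

Adding another column would then only require writing its scoring function and adding one entry to the dict; the loop itself stays unchanged.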