
My lack of experience with Python is making it hard to get this loop working. This is the DataFrame: https://1drv.ms/u/s!AlPw3RIiTz1ChRo9YO4kYCI7n0r0?e=OeiLgx.

I would like to add one more column ('customer_type') containing a string description (one of 'New_Guest', 'Repetative with cancelations', 'Repetative NO cancelations'). The conditions to be met are:

  • New_Guest - 'is_repeated_guest' ==0 AND 'previous_cancellations'==0
  • Repetative with cancelations - 'is_repeated_guest' ==1 AND 'previous_cancellations' > 0
  • Repetative NO cancelations - 'is_repeated_guest' ==1 AND 'previous_cancellations'==0

I tried the first condition, without success:

for i in df_test.loc[:,'customer_type'] :
    if ((df_test['is_repeated_guest']==0) & (df_test['previous_cancellations']==0)).all() :
        df_test.loc[:,'customer_type'] = 'New Guest'
    else : df_test.loc[:,'customer_type'] = 0

Does anyone have any suggestions?

PeteG

1 Answer


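Your attempt gives every row the same value because .all() collapses the whole comparison into a single True/False. You can instead filter the DataFrame and set the column value without a loop, using df.loc[condition, column_name] = new_value. In your case, it would look something like this: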

df['customer_type'] = ''  # add new column
df.loc[(df['is_repeated_guest']==0) & (df['previous_cancellations']==0), 'customer_type'] = 'New_Guest'
# add your other two conditions
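
With all three conditions written out, it could look like the sketch below. The sample DataFrame is made up for illustration (I have not used the linked file); only the two column names and the three labels are taken from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "is_repeated_guest":      [0, 1, 1, 0],
    "previous_cancellations": [0, 2, 0, 1],
})

df["customer_type"] = ""  # new column, empty string as the starting value

# One boolean mask per case; each mask has one True/False per row.
new_guest = (df["is_repeated_guest"] == 0) & (df["previous_cancellations"] == 0)
rep_cancel = (df["is_repeated_guest"] == 1) & (df["previous_cancellations"] > 0)
rep_clean = (df["is_repeated_guest"] == 1) & (df["previous_cancellations"] == 0)

# Assign the label only to the rows where the mask is True.
df.loc[new_guest, "customer_type"] = "New_Guest"
df.loc[rep_cancel, "customer_type"] = "Repetative with cancelations"
df.loc[rep_clean, "customer_type"] = "Repetative NO cancelations"

print(df)
```

The last sample row (not a repeated guest, but with a previous cancellation) matches none of the masks, so it keeps the empty string.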

You may also want to set a default 'customer_type' when you create the new column, in case some rows fall outside the three listed cases. For example, could you ever have 'is_repeated_guest'==0 while 'previous_cancellations'==1?
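
One way to get an explicit default, as a rough sketch, is numpy.select, which takes a list of conditions, a matching list of values, and a default for rows that match none of them. The 'Other' label and the sample data here are my own placeholders, not something from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "is_repeated_guest":      [0, 1, 1, 0],
    "previous_cancellations": [0, 2, 0, 1],
})

conditions = [
    (df["is_repeated_guest"] == 0) & (df["previous_cancellations"] == 0),
    (df["is_repeated_guest"] == 1) & (df["previous_cancellations"] > 0),
    (df["is_repeated_guest"] == 1) & (df["previous_cancellations"] == 0),
]
labels = [
    "New_Guest",
    "Repetative with cancelations",
    "Repetative NO cancelations",
]

# 'Other' is a placeholder default for rows outside the three cases.
df["customer_type"] = np.select(conditions, labels, default="Other")

# Inspect the rows that fell through to the default.
uncovered = df[df["customer_type"] == "Other"]
print(uncovered)
```

If `uncovered` is empty on the real data, the three cases cover everything; otherwise it shows which combinations still need a label.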

Davis