I have a column called trip_cost and I want to create a new column called level.

I am doing this:

df['level'] = ''
for i in df.trip_cost:
    if i < 65.0:
        df['Nivel'] = 'low'
    
    elif 65.0 <= i <= 82.0:
        df['Nivel'] = 'medium'
    
    else:
        df['Nivel'] = 'high'

The problem is that the whole column ends up with the level 'low', instead of each row getting the level that matches its trip_cost.

Am I doing something wrong?

  • Assigning `df['Nivel']` assigns that field in every row. It seems like you expect it to just do the current row of the `for` loop, but how would it know what that row is? – Barmar Apr 21 '22 at 18:03
  • As a general rule, if you're looping over a dataframe you're probably doing something wrong, since Pandas has built-in operations to filter and update. – Barmar Apr 21 '22 at 18:04

1 Answer

Each assignment to df['Nivel'] inside the loop overwrites the whole column, so the column ends up with the label computed for the last row; your last record probably has trip_cost < 65. I suggest something like this:

def label_nivel(row):
    # Return the level for a single row based on its trip_cost
    if row['trip_cost'] < 65:
        return 'low'
    if row['trip_cost'] <= 82:
        return 'medium'
    return 'high'

# axis=1 passes each row to label_nivel
df['Nivel'] = df.apply(label_nivel, axis=1)
pbartkow