
I am trying to add a new column to a dataframe that uses other columns to create a 'Green Score' for each row. In the example below, I would like each car 'Model' to have a score that shows how 'green' the car is.

My data (simplified):

import pandas as pd

cars = {'Model': ['Honda Civic','Toyota Corolla','Smart Car','Tesla'],
        'Green Fuel': [False, False, True, True],
        'Energy Use': ['High','Medium','Low','Low'],
        }

df = pd.DataFrame(cars, columns = ['Model', 'Green Fuel', 'Energy Use'])

df['Green Score'] = 0

Printing the cars dataframe:

            Model  Green Fuel Energy Use  Green Score
0     Honda Civic       False       High            0
1  Toyota Corolla       False     Medium            0
2       Smart Car        True        Low            0
3           Tesla        True        Low            0

Now, to calculate the Green Scores of each model I am trying this:

for car in df['Model']:
    if df['Green Fuel'] == True:
        df['Green Score'] += 1
    else:
        pass

When I run this, however, I get the error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can someone point me in the right direction as to how to solve this error?

Toms Code

2 Answers


First, it is best to avoid for loops in pandas whenever a vectorized method exists. Here, for example, you can convert the booleans to integers and add them to the other column mapped to numbers (the scoring is only my idea, I'm trying to be creative ;) ):

d = {'High':1,'Medium':2,'Low':3}
df['Green Score'] = df['Green Fuel'].astype(int) + df['Energy Use'].map(d)
print (df)
            Model  Green Fuel Energy Use  Green Score
0     Honda Civic       False       High            1
1  Toyota Corolla       False     Medium            2
2       Smart Car        True        Low            4
3           Tesla        True        Low            4
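
If the score should count only the fuel flag, as the loop in the question seems to intend, the same vectorized idea can be sketched with a boolean mask and `.loc`. This is a minimal sketch that rebuilds the question's dataframe; whether 'Energy Use' should also contribute is left open.

```python
import pandas as pd

cars = {'Model': ['Honda Civic', 'Toyota Corolla', 'Smart Car', 'Tesla'],
        'Green Fuel': [False, False, True, True],
        'Energy Use': ['High', 'Medium', 'Low', 'Low']}
df = pd.DataFrame(cars)
df['Green Score'] = 0

# The boolean column itself serves as a row mask:
# add 1 only on rows where 'Green Fuel' is True.
df.loc[df['Green Fuel'], 'Green Score'] += 1

print(df['Green Score'].tolist())  # [0, 0, 1, 1]
```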
jezrael

I think the problem is in your for loop: you iterate over the values of one column, but `if df['Green Fuel'] == True` compares the whole column at once, which produces a Series of booleans rather than a single True or False. You should iterate over the rows instead and check each row's value by column name, e.g.:

for idx, row in df.iterrows():
    # iterrows() yields (index, row) pairs; row is a copy,
    # so write the new score back through df.loc
    if row['Green Fuel']:
        df.loc[idx, 'Green Score'] += 1
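
To see where the ValueError in the question comes from, here is a small sketch that reproduces it on a one-column dataframe. The names `mask`, `error_message`, `any_green` and `all_green` are only illustrative; the point is that the comparison returns a Series, which `if` cannot reduce to a single boolean without `.any()` or `.all()`.

```python
import pandas as pd

df = pd.DataFrame({'Green Fuel': [False, False, True, True]})

# Element-wise comparison: the result is a Series of booleans, not one bool.
mask = df['Green Fuel'] == True

try:
    if mask:  # pandas refuses to guess a single truth value here
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# Explicit reductions state what is meant.
any_green = bool(mask.any())  # True: at least one row is True
all_green = bool(mask.all())  # False: not every row is True
```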
D_action
    https://stackoverflow.com/questions/24870953/does-pandas-iterrows-have-performance-issues/24871316#24871316 – jezrael Dec 03 '20 at 12:58