-1

I have an Excel file like this enter image description here

I want to say that if the average purchase of these people in 3 months was below 60, these people should be fined 120 thousand dollars.

I tried to solve the problem by writing this code

import pandas as pd

file_df = pd.read_excel('users.xlsx')


def penalty_calculation(df):
    sum_of_three_month = df["First month purchase "] + df["Purchase of the second month"] + df["Purchase of the third month"]
    df["Average purchase of 3 months"] = sum_of_three_month // 3
    if df["Average purchase of 3 months"] <= 60:
        df["penalty"] = "$120000"


penalty_calculation(file_df)

But I face this error

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

what is the problem ?

  • `if df["Average purchase of 3 months"] <= 60:` is equivalent to asking `if [10, 20, 30] < 20:`. It doesn't make sense - the list (or `Series` in this case) cannot be compared to an int, so are you asking it to iterate over the values? Pandas doesn't know - use `np.where()` or `df.loc` – roganjosh Dec 28 '22 at 09:29
  • Can you please give a code example of how to use df.loc for such comparisons? – user17713444 Dec 28 '22 at 09:34
  • a Series *can* be compared to an integer, it's one of the most useful things in the library; direct boolification of the result is what's problematic (implicit with the if statement here) – Mustafa Aydın Dec 28 '22 at 09:34
  • almost 2 million times seen duplicate says things about that https://stackoverflow.com/questions/36921951/truth-value-of-a-series-is-ambiguous-use-a-empty-a-bool-a-item-a-any-o – Mustafa Aydın Dec 28 '22 at 09:35
  • I have read this question before, the problem is the use of or, it is different for me – user17713444 Dec 28 '22 at 09:38
  • i suspected that; the accepted answer (or probably others too) touch upon your case as well... – Mustafa Aydın Dec 28 '22 at 09:40
  • The only question left for me is how to check multiple conditions in a chain? – user17713444 Dec 28 '22 at 09:50

2 Answers2

1

You can use numpy where -

df["penalty"] = np.where(df["Average purchase of 3 months"]<60, "$120000", None)

This will result "$120000" where "Average purchase of 3 months" is less than 60 and None otherwise.

Tom Ron
  • 5,906
  • 3
  • 22
  • 38
1

In case you want to do it in a loop

for index, row in df.iterrows():
    if (df.loc[index, 'Average purchase of 3 months'] <= 60):
        df.loc[index, 'Penalty'] = "$120000"
  • No, you don't want to do it in a loop. This is terribly inefficient – roganjosh Dec 28 '22 at 10:02
  • How is it possible that I can check several conditions in a chain? For example, if he was under 60 and that person's name was Kambiz – user17713444 Dec 28 '22 at 10:03
  • @user17713444 you can add conditions to the if statement by using **and**. For example, `if ((df.loc[index, "Average purchase of 3 months"] <=60) and (df.loc[index, "Name"] == "Kambiz")):` – Mughdat Hasanov Dec 28 '22 at 13:02