2

I have a dataset with two timestamp columns: a start time and an end time. I have calculated the difference and stored it in another column. Based on that difference column, I want to fill in a value in a further column. I am using a for loop with an if/else for this, but on execution I get the error "The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()".

   Time_df = pd.read_excel('filepath')

   print(Time_df.head(20))

   for index, rows in Time_df.head().iterrows():
         if(Time_df["Total Time"] < 6.00 ):
             Time_df["Code"] = 1

   print(Time_df.head(20))  

In the "Total Time" column, wherever a value less than 6 is encountered, the code should put 1 in the "Code" column. Instead, I get the error stated in the question.

sumitpal0593

4 Answers

5

Try with np.where():


df["Code"] = np.where(df["Total Time"] < 6.00, 1, df["Code"])

Explanation:

# np.where(condition, value where the condition is True, value where it is False)
# returns an array with one element chosen per row according to the condition
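
A minimal sketch of how this behaves on a small frame. The sample values are made up for illustration (the real data comes from the asker's Excel file), and it assumes a "Code" column already exists, since the third argument reads from it:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"Total Time": [2.5, 7.0, 5.99, 6.0], "Code": [0, 0, 0, 0]})

# Elementwise choice: 1 where the condition holds, the existing Code value elsewhere.
df["Code"] = np.where(df["Total Time"] < 6.00, 1, df["Code"])

print(df["Code"].tolist())  # [1, 0, 1, 0]
```

The comparison is strict, so a row with exactly 6.0 keeps its original code.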
anky
2

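To fix your code, compare the value from the current row and assign with .loc. Note that Time_df.head() limits the loop to the first five rows; drop .head() to process the whole frame.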

   print(Time_df.head(20))

   for index, rows in Time_df.head().iterrows():
         if(rows["Total Time"] < 6.00 ):
             Time_df.loc[index,"Code"] = 1

   print(Time_df.head(20))  
BENY
2

This happens to me a lot. In if (Time_df["Total Time"] < 6.00), the expression Time_df["Total Time"] < 6.00 is a Series of True/False values, one per row, and Python does not know how to evaluate a whole Series as a single Boolean. It depends on what you want, but most likely you want to do:

Time_df.loc[Time_df["Total Time"] < 6.00, "Code"] = 1

which puts 1 in column "Code" wherever "Total Time" is < 6.
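
A small sketch that reproduces the error and then applies the mask-based assignment. The sample frame is hypothetical, and the "Code" column is pre-filled with 0 here purely so the result is easy to read:

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's spreadsheet.
Time_df = pd.DataFrame({"Total Time": [2.5, 7.0, 5.99], "Code": [0, 0, 0]})

# The comparison yields a boolean Series with one value per row.
mask = Time_df["Total Time"] < 6.00

error_message = ""
try:
    if mask:  # asks for one True/False from many values
        pass
except ValueError as err:
    error_message = str(err)

print(error_message)

# Use the mask to select rows instead of testing it in an if statement.
Time_df.loc[mask, "Code"] = 1
print(Time_df["Code"].tolist())  # [1, 0, 1]
```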

Quang Hoang
1
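def myfn(row):
    # Return 1 for rows under the threshold; other rows get None (NaN in the column)
    if row['Total Time'] < 6:
        return 1
    return None


# Column and frame names match the question: Time_df and "Code"
Time_df['Code'] = Time_df.apply(myfn, axis=1)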
IWHKYB