I have a df like this:
ID Machine 17-Dec 18-Jan 18-Feb 18-Mar 18-Apr 18-May
160 Car 348 280 274 265 180 224
163 Var 68248 72013 55441 64505 71097 78006
165 Assus 1337 1279 1536 1461 1555 1700
215 Owen 118 147 104 143 115 153
I calculated the mean and standard deviation like this:
df['Avg'] = np.mean(all_np_values, axis=1)
df['Std.Dev'] = np.std(all_np_values, axis=1)
Then I get the following data frame.
ID Machine 17-Dec 18-Jan 18-Feb 18-Mar 18-Apr 18-May Mean Std.Dev
160 Car 348 280 274 265 180 224 261.83 51.70
163 Var 68248 72013 55441 64505 71097 78006 68218.33 7018.24
165 Assus 1337 1279 1536 1461 1555 1700 1478 140.44
215 Owen 118 147 104 143 115 153 130 18.40
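For reference, here is a minimal, self-contained sketch that reproduces the calculation. It assumes all_np_values is simply the six monthly columns as a NumPy array (its definition is not shown above, so this is a guess), and it uses the column name 'Avg' from the code, although the printed table labels that column Mean.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame({
    "ID": [160, 163, 165, 215],
    "Machine": ["Car", "Var", "Assus", "Owen"],
    "17-Dec": [348, 68248, 1337, 118],
    "18-Jan": [280, 72013, 1279, 147],
    "18-Feb": [274, 55441, 1536, 104],
    "18-Mar": [265, 64505, 1461, 143],
    "18-Apr": [180, 71097, 1555, 115],
    "18-May": [224, 78006, 1700, 153],
})

month_cols = ["17-Dec", "18-Jan", "18-Feb", "18-Mar", "18-Apr", "18-May"]

# Assumption: all_np_values holds only the monthly columns, one row per machine.
all_np_values = df[month_cols].to_numpy()

# axis=1 computes one statistic per row.
df["Avg"] = np.mean(all_np_values, axis=1)
# np.std defaults to the population standard deviation (ddof=0).
df["Std.Dev"] = np.std(all_np_values, axis=1)

print(df[["Machine", "Avg", "Std.Dev"]].round(2))
```

Note that np.std uses ddof=0 by default, while pandas' own .std() uses ddof=1, so the two would give different numbers on the same rows.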
Now I want to look at the 18-May value in each row and say 'Yes' or 'No' depending on whether it is more than 2 standard deviations above or below the mean, so that the final DataFrame looks like this:
ID Machine 17-Dec 18-Jan 18-Feb 18-Mar 18-Apr 18-May Mean Std.Dev Above Below
160 Car 348 280 274 265 180 224 261.83 51.70 No No
163 Var 68248 72013 55441 64505 71097 78006 68218.33 7018.24 No No
165 Assus 1337 1279 1536 1461 1555 1700 1478 140.44 No No
215 Owen 118 147 104 143 115 153 130 18.40 No No
I tried to do the following,
for value in df['18-May']:
    if value > (df['Avg'] + 2 * df['Std.Dev']):
        df['Above'] = 'Yes'
    else:
        df['Above'] = 'No'
This gives me an error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I understand the error after reading some older posts. My conclusion is that the comparison returns a Series of bool values rather than a single bool.
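A small sketch of what seems to be happening, using two hypothetical rows with values taken from the table (Car and Owen) and the Car value for 18-May: comparing a scalar against a whole column gives one True/False per row, and `if` cannot reduce that to a single answer.

```python
import pandas as pd

# Two illustrative rows (Car and Owen) taken from the table above.
avg = pd.Series([261.83, 130.0])
std = pd.Series([51.70, 18.40])
value = 224  # a single 18-May value, as seen inside the loop

# The right-hand side is a whole Series, so the result is a boolean Series.
result = value > (avg + 2 * std)
print(result.tolist())

raised = False
try:
    if result:  # pandas cannot collapse several booleans into one
        pass
except ValueError as err:
    raised = True
    print("ValueError:", err)
```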
I am not sure how to use a mask when creating a new df column so that 'Yes' and 'No' end up in my 'Above' and 'Below' columns. How can I add that to my code above?
Any thoughts would be helpful.
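One possible approach, offered only as a sketch: compare whole columns at once instead of looping, and let np.where turn the resulting boolean mask into labels. The example assumes the columns are named 'Avg' and 'Std.Dev' as in the code above and that "above" and "below" mean strictly greater or less than the 2-standard-deviation bounds. The names upper and lower are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Summary values copied from the table in the question.
df = pd.DataFrame({
    "Machine": ["Car", "Var", "Assus", "Owen"],
    "18-May": [224, 78006, 1700, 153],
    "Avg": [261.83, 68218.33, 1478.0, 130.0],
    "Std.Dev": [51.70, 7018.24, 140.44, 18.40],
})

# Row-wise bounds, computed for every row in one step.
upper = df["Avg"] + 2 * df["Std.Dev"]
lower = df["Avg"] - 2 * df["Std.Dev"]

# Each comparison yields a boolean mask; np.where maps it to 'Yes' / 'No'.
df["Above"] = np.where(df["18-May"] > upper, "Yes", "No")
df["Below"] = np.where(df["18-May"] < lower, "Yes", "No")

print(df)
```

With the sample data, every 18-May value falls inside its bounds, so both new columns contain only 'No', which matches the desired table.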