I have dataframes with multiple rows and columns depicting a stock's prices. Columns depicting open price, close price etc along with other formulas based on the those prices. Rows depicting time per min. The first few rows of the column 'formula' may not have data since it requires minimum number of rows before the first value under 'formula' will show up.
You may download a sample CSV here.
I have another set of code that simply draws the last row and compare to the previous rows. It's a simple if/elif/else python which seem to work, please correct me if I'm wrong or if there is a better way.
if df.close[-1] >= 30:
cond17 = df.formula[-4:-1].max() <= 5
elif 15 <= df.close[-1] < 30:
cond17 = df.formula[-4:-1].max() <= 3
elif 5 <= df.close[-1] < 15:
cond17 = df.formula[-4:-1].max() <= 1
elif 2.5 <= df.close[-1] < 5:
cond17 = df.formula[-4:-1].max() <= 0.5
else:
cond17 = df.formula[-4:-1].max() <= 0.02
I'm having trouble trying to get this set of conditions to output the values (True / False) under a new column "cond17". I searched and read that np.where and np.select are the options to go for, since they are fast. But tried the following and got the respective errors, listed below.
Please advise how should I apply for this case!
np.where - using nested wheres.
df.loc[:, ('Mcond17')] = np.where(df.close >= 30, df.formula.shift(1).tail(3) <= 5,
(np.where(15 <= df.close < 30, df.formula.shift(1).tail(3) <= 3,
(np.where(5 <= df.close < 15, df.formula.shift(1).tail(3) <= 1,
(np.where(2.5 <= df.close < 5, df.formula.shift(1).tail(3) <= 0.5,
df.formula.shift(1).tail(3) <= 0.02)))))))
Got the following error:
(np.where(15 <= df.close < 30, df.tenkanspanbdist.shift(1).tail(3) <= 3,
File "C:\Users\***\AppData\Roaming\Python\Python38\site-packages\pandas\core\generic.py", line 1326, in __nonzero__
raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
np.select - read that if more than 2 conditions, I should use this.
cond17list = [(df.close >= 30), (15 <= df.close < 30), (5 <= df.close < 15), (2.5 <= df.close < 5)]
cond17choices = [(df.formula[-4:-1].max() <= 5), (df.formula[-4:-1].max() <= 3), (df.formula[-4:-1].max() <= 1), (df.formula[-4:-1].max() <= 0.5)]
df.loc[:, ('cond17')] = np.select(cond17list, cond17choices, default=(df.formula[-4:-1].max() <= 0.02))
Got the following error:
cond17list = [(df.close >= 30), (15 <= df.close < 30), (5 <= df.close < 15), (2.5 <= df.close < 5)]
File "C:\Users\***\AppData\Roaming\Python\Python38\site-packages\pandas\core\generic.py", line 1326, in __nonzero__
raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().