-1

I have dataframes with multiple rows and columns depicting a stock's prices. Columns depicting open price, close price etc along with other formulas based on the those prices. Rows depicting time per min. The first few rows of the column 'formula' may not have data since it requires minimum number of rows before the first value under 'formula' will show up.

enter image description here

You may download a sample CSV here.

I have another set of code that simply draws the last row and compare to the previous rows. It's a simple if/elif/else python which seem to work, please correct me if I'm wrong or if there is a better way.

if df.close[-1] >= 30:
        cond17 = df.formula[-4:-1].max() <= 5
    elif 15 <= df.close[-1] < 30:
        cond17 = df.formula[-4:-1].max() <= 3
    elif 5 <= df.close[-1] < 15:
        cond17 = df.formula[-4:-1].max() <= 1
    elif 2.5 <= df.close[-1] < 5:
        cond17 = df.formula[-4:-1].max() <= 0.5
    else:
        cond17 = df.formula[-4:-1].max() <= 0.02

I'm having trouble trying to get this set of conditions to output the values (True / False) under a new column "cond17". I searched and read that np.where and np.select are the options to go for, since they are fast. But tried the following and got the respective errors, listed below.

Please advise how should I apply for this case!

np.where - using nested wheres.

    df.loc[:, ('Mcond17')] = np.where(df.close >= 30, df.formula.shift(1).tail(3) <= 5,
        (np.where(15 <= df.close < 30, df.formula.shift(1).tail(3) <= 3,
        (np.where(5 <= df.close < 15, df.formula.shift(1).tail(3) <= 1,
        (np.where(2.5 <= df.close < 5, df.formula.shift(1).tail(3) <= 0.5,
        df.formula.shift(1).tail(3) <= 0.02)))))))

Got the following error:

    (np.where(15 <= df.close < 30, df.tenkanspanbdist.shift(1).tail(3) <= 3,
  File "C:\Users\***\AppData\Roaming\Python\Python38\site-packages\pandas\core\generic.py", line 1326, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

np.select - read that if more than 2 conditions, I should use this.

cond17list = [(df.close >= 30), (15 <= df.close < 30), (5 <= df.close < 15), (2.5 <= df.close < 5)]
cond17choices = [(df.formula[-4:-1].max() <= 5), (df.formula[-4:-1].max() <= 3), (df.formula[-4:-1].max() <= 1), (df.formula[-4:-1].max() <= 0.5)]
df.loc[:, ('cond17')] = np.select(cond17list, cond17choices, default=(df.formula[-4:-1].max() <= 0.02))

Got the following error:

cond17list = [(df.close >= 30), (15 <= df.close < 30), (5 <= df.close < 15), (2.5 <= df.close < 5)]
File "C:\Users\***\AppData\Roaming\Python\Python38\site-packages\pandas\core\generic.py", line 1326, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Garrad2810
  • 113
  • 6
  • What should the new column show for each of the conditions? – norie Jun 22 '21 at 12:14
  • This question would benefit from a [good pandas example](https://stackoverflow.com/a/20159305/11659881). It makes it a lot easier for us to create a solution and then test it. – Kraigolas Jun 22 '21 at 12:15
  • @norie, the new column should output True / False based on the conditions – Garrad2810 Jun 22 '21 at 12:34
  • 1
    @Kraigolas, noted. Provided a CSV sample file. Sorry have not learned how to provide code that generates the data. – Garrad2810 Jun 22 '21 at 12:46
  • Is there anything else lacking in this question? Need help! – Garrad2810 Jun 22 '21 at 14:46
  • You have 4 different conditions, does that not mean you would have 4 possible l values returned in the new column rather than only True/False? – norie Jun 22 '21 at 16:31

1 Answers1

0

Rechecked my code, turns out it was a simple mistake.

I shouldn't have used .tail() since that checks only the last rows of the dataframe.

I needed to check every row, thus should have used .rolling() instead.

Garrad2810
  • 113
  • 6