
I'm trying to create a dataframe of stock prices, and append a True/False column for each row based on certain conditions.

ind = [0,1,2,3,4,5,6,7,8]
close = [10,20,30,40,30,20,30,40,50]
open = [11,21,31,41,31,21,31,41,51]
upper = [11,21,31,41,31,21,31,41,51]
mid = [11,21,31,41,31,21,31,41,51]
cond1 = [True,True,True,False,False,True,False,True,True]
cond2 = [True,True,False,False,False,False,False,False,False]
cond3 = [True,True,False,False,False,False,False,False,False]
cond4 = [True,True,False,False,False,False,False,False,False]
cond5 = [True,True,False,False,False,False,False,False,False]

def check_conds(df, latest_price):
    '''1st set of INT for early breakout of bollinger upper'''
    df.loc[:, ('cond1')] = df.close.shift(1) > df.upper.shift(1)
    df.loc[:, ('cond2')] = df.open.shift(1) < df.mid.shift(1).rolling(6).min()
    df.loc[:, ('cond3')] = df.close.shift(1).rolling(7).min() <= 21
    df.loc[:, ('cond4')] = df.upper.shift(1) < df.upper.shift(2)
    df.loc[:, ('cond5')] = df.mid.tail(3).max() < 30
    df.loc[:, ('Overall')] = all([df.cond1,df.cond2,df.cond3,df.cond4,df.cond5])    
    return df

The original 9 rows by 4 columns dataframe contains only the close / open / upper / mid columns.

The check_conds function returns the df nicely, with the new cond1-5 columns of True / False appended for each row, resulting in a dataframe with 9 rows by 9 columns.

However, when I tried to add an 'Overall' True / False column based on cond1-5 for each row, I got "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()." from this line:

df.loc[:, ('Overall')] = all([df.cond1,df.cond2,df.cond3,df.cond4,df.cond5])

So I tried pulling out each of cond1-5, and those are indeed Series of True / False. How do I get that last line in the function to check each row's cond1-5 and return True if all of cond1-5 are True for that row?

I just can't wrap my head around why the cond1-5 lines in the function work fine, comparing the values within each row, while the last line above (written in a similar style) is returning an entire Series.

Please advise!

Garrad2810

1 Answer


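The error comes from Python's built-in all(), which needs a single True / False from each Series and cannot get one. The message points you to .all(); for a dataframe that is pd.DataFrame.all. To check per row that all conditions are True, you have to specify the argument axis=1: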

df.loc[:, df.columns.str.startswith('cond')].all(axis=1)

Note that df.columns.str.startswith('cond') is just a lazy way of selecting all columns that start with 'cond'. Of course you can achieve the same with df[['cond1', 'cond2', 'cond3', 'cond4', 'cond5']].
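For illustration, here is a minimal sketch of both the failure and the fix. It assumes a small frame built only from the cond1-5 values listed in the question (the price columns are left out), and the names cond_cols, builtin_all_failed and overall_alt are made up for this example:

```python
import pandas as pd

# Hypothetical sample frame reusing the cond1-cond5 values from the question.
df = pd.DataFrame({
    "cond1": [True, True, True, False, False, True, False, True, True],
    "cond2": [True, True, False, False, False, False, False, False, False],
    "cond3": [True, True, False, False, False, False, False, False, False],
    "cond4": [True, True, False, False, False, False, False, False, False],
    "cond5": [True, True, False, False, False, False, False, False, False],
})

# Python's built-in all() calls bool() on each Series, which pandas refuses
# to do for a Series with more than one element, hence the ValueError.
try:
    all([df.cond1, df.cond2, df.cond3, df.cond4, df.cond5])
    builtin_all_failed = False
except ValueError:
    builtin_all_failed = True

# DataFrame.all(axis=1) reduces across the columns, giving one bool per row.
cond_cols = df.columns[df.columns.str.startswith("cond")]
df["Overall"] = df[cond_cols].all(axis=1)

# Equivalent element-wise form using the & operator on the Series.
overall_alt = df.cond1 & df.cond2 & df.cond3 & df.cond4 & df.cond5

print(df["Overall"].tolist())
# [True, True, False, False, False, False, False, False, False]
```

The comparisons that build cond1-5 work because operators like > and < are applied element by element on a Series, whereas the built-in all() is plain Python and asks each Series for one truth value.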

gofvonx
  • I've got another question based on the same dataframe. It's probably the same issue in a different variant, I think. Hope to pick your brain on this, thanks! [link](https://stackoverflow.com/q/67146925/4197720) – Garrad2810 Apr 18 '21 at 13:00