I'm trying to determine how many days in a row a certain column is above or below a certain threshold.
>>> df.head()
Open High Low Close Volume
Date
2004-08-19 49.96 51.98 47.93 50.12 NaN
2004-08-20 50.69 54.49 50.20 54.10 NaN
2004-08-23 55.32 56.68 54.47 54.65 NaN
2004-08-24 55.56 55.74 51.73 52.38 NaN
2004-08-25 52.43 53.95 51.89 52.95 NaN
>>>
For the example above, I'd like another column df['RDA'] to increment for each consecutive day that the column Open is over 50. For each consecutive day below 50, I'd like a second column df['RDB'] to increment and df['RDA'] to reset to 0. I've tried if/then logic but it doesn't like that and gives me a value error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). how can i sort it out
I'd like the output of my dataframe to look like this:
>>> df.head()
Open High Low Close Volume RDA RDB
Date
2004-08-19 51.96 51.98 47.93 50.12 NaN 1 0
2004-08-20 50.69 54.49 50.20 54.10 NaN 2 0
2004-08-23 55.32 56.68 54.47 54.65 NaN 3 0
2004-08-24 45.56 55.74 51.73 52.38 NaN 0 1
2004-08-25 42.43 53.95 51.89 52.95 NaN 0 2
2004-08-26 41.96 51.98 47.93 50.12 NaN 0 3
2004-08-27 40.69 54.49 50.20 54.10 NaN 0 4
2004-08-28 55.32 56.68 54.47 54.65 NaN 1 0
2004-08-29 55.56 55.74 51.73 52.38 NaN 2 0
2004-08-30 52.43 53.95 51.89 52.95 NaN 3 0
>>>
Is this something that's possible with Pandas? I know you can get a count of the values in a column, but I've been so far unable to find a method for with consecutive values. An if/then statement with 2 variables would work, but like I mentioned above, I get a value error when I tried that. Any help would be appreciated.