2

I am trying to figure out how I can mark the rows where the price are part of 4 increase prices . the "is_consecutive" is actually the mark .

I managed to do the diff between the rows :

df['diff1'] = df['Close'].diff()

But I didn't managed to find out which row is a part of 4 increase prices .

I had a thought to use df.rolling() .

The exmple df,

On rows 0-3 , we need to get an output of 'True' on the ["is_consecutive"] column , because the ['diff1'] on this consecutive rows is increase for 4 rows .

On rows 8-11 , we need to get an output of 'False' on the ["is_consecutive"] column , because the ['diff1'] on this consecutive rows is zero .

   Date      Price           diff1    is_consecutive   
0  1/22/20    0               0          True
1  1/23/20    130            130         True
2  1/24/20    144            14          True
3  1/25/20    150            6           True
4  1/27/20    60            -90          False
5  1/28/20    95             35          False
6  1/29/20    100            5           False
7  1/30/20    50            -50          False
8  2/01/20    100            0           False
9  1/02/20    100            0           False
10  1/03/20   100            0           False
11  1/04/20   100            0           False
12  1/05/20   50            -50          False

general example :

if price = [30,55,60,65,25]

the different form the consecutive number on the list will be :

diff1 = [0,25,5,5,-40]

So when the diff1 is plus its actually means the consecutive prices are increase .

I need to mark(in the df) the rows that have 4 consecutive that go up.

Thank You for help (-:

Mark
  • 51
  • 6
  • can you supply code to build the dataframe and make it easier for us to help? – NuLo Sep 17 '21 at 16:02
  • you can use the rolling function dee https://stackoverflow.com/questions/13996302/python-rolling-functions-for-groupby-object for use case – gal peled Sep 17 '21 at 16:06

3 Answers3

4

Try: .rolling with window of size 4 and min periods 1:

df["is_consecutive"] = (
    df["Price"]
    .rolling(4, min_periods=1)
    .apply(lambda x: (x.diff().fillna(0) >= 0).all())
    .astype(bool)
)
print(df)

Prints:

      Date  Price  is_consecutive
0  1/22/20      0            True
1  1/23/20    130            True
2  1/24/20    144            True
3  1/25/20    150            True
4  1/26/20     60           False
5  1/26/20     95           False
6  1/26/20    100           False
7  1/26/20     50           False
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
  • 1
    nice one Andrej ;) +1 – mozway Sep 17 '21 at 16:09
  • 1
    Hey Thank you , but its count consecutive prices have the same values too , I tried to change for : (lambda x: (x.diff().fillna(0) >= 0.001) but it didn't work , can you please help me with that ? (-: – Mark Sep 17 '21 at 16:26
  • @Andrej pls share how to check if 4 consecutive rows were values going down? – Divyank Aug 21 '22 at 11:08
1

Assuming the dataframe is sorted. One way is based on the cumsum of the differences to identify the first time an upward Price move succeeding a 3 days upwards trend (i.e. 4 days of upward trend).

quant1 = (df['Price'].diff().apply(np.sign) == 1).cumsum()
quant2 = (df['Price'].diff().apply(np.sign) == 1).cumsum().where(~(df['Price'].diff().apply(np.sign) == 1)).ffill().fillna(0).astype(int)
df['is_consecutive'] = (quant1-quant2) >= 3

note that the above takes into account only strictly increasing Prices (not equal).

Then we override also the is_consecutive tag for the previous 3 Prices to be also TRUE using the win_view self defined function:

def win_view(x, size):
    if isinstance(x, list):
        x = np.array(x)
    if isinstance(x, pd.core.series.Series):
        x = x.values
    if isinstance(x, np.ndarray):
        pass
    else:
        raise Exception('wrong type')
    return np.lib.stride_tricks.as_strided(
        x,
        shape=(x.size - size + 1, size),
        strides=(x.strides[0], x.strides[0])
    )


arr = win_view(df['is_consecutive'], 4)
arr[arr[:,3]] = True

Note that we inplace replace the values to be True.

EDIT 1 Inspired by the self defined win_view function, I realized that the solution it can be obtained simply by win_view (without the need of using cumsums) as below:

df['is_consecutive'] = False
arr = win_view(df['Price'].diff(), 4)
arr_ind = win_view(list(df['Price'].index), 4)
mask = arr_ind[np.all(arr[:, 1:] > 0, axis=1)].flatten()
df.loc[mask, 'is_consecutive'] = True

We maintain 2 arrays, 1 for the returns and 1 for the indices. We collect the indices where we have 3 consecutive positive return np.all(arr[:, 1:] > 0, axis=1 (i.e. 4 upmoving prices) and we replace those in our original df.

MaPy
  • 505
  • 1
  • 6
  • 9
0

The function will return columns named "consecutive_up" which represents all rows that are part of the 5 increase series and "consecutive_down" which represents all rows that are part of the 4 decrees series.

def c_func(temp_df):

     temp_df['increase'] = temp_df['Price'] > temp_df['Price'].shift()
     temp_df['decrease'] = temp_df['Price'] < temp_df['Price'].shift()

     temp_df['consecutive_up'] = False
     temp_df['consecutive_down'] = False

     for ind, row in temp_df.iterrows():
          if row['increase'] == True:
               count += 1
          else:
               count = 0
          if count == 5:
               temp_df.iloc[ind - 5:ind + 1, 4] = True
          elif count > 5:
               temp_df.iloc[ind, 4] = True

     for ind, row in temp_df.iterrows():
          if row['decrease'] == True:
               count += 1
          else:
               count = 0
          if count == 4:
               temp_df.iloc[ind - 4:ind + 1, 5] = True
          elif count > 4:
               temp_df.iloc[ind, 5] = True
     return temp_df
RiveN
  • 2,595
  • 11
  • 13
  • 26
Mark
  • 51
  • 6