3

I am trying to do a rolling sum of 1000 rows. I want to sum all rows where ClosePrice is between ClosePrice_low and ClosePrice_high for each respective row and the 999 above it.

for example:

rolling count 1000: check rows 0:1000 and sum if between 0.0000189375 and 0.0000185625 (aka find all rows from 0 to 1000 that are between ClosePrice_low and ClosePrice_high on row 1000 and sum ClosePrice)

rolling count 1001: check rows 1:1001 and sum if between 0.0000189476 and 0.0000185724

doing this below does not work:

tempdf['ClosePrice'][np.where(tempdf['ClosePrice'] < tempdf['ClosePrice_high'] & \
    tempdf['ClosePrice'] > tempdf['ClosePrice_low'],tempdf['ClosePrice'],0)].rolling(1000).sum()

because it will always reference the value of itself which always falls between the high and low.

Also my dataframe is about 4 million rows so I need a fast computation.

Any help would be greatly appreciated!

      ClosePrice  ClosePrice_high  ClosePrice_low
1000   0.00001875     0.0000189375    0.0000185625
1001   0.00001876     0.0000189476    0.0000185724
1002   0.00001868     0.0000188668    0.0000184932
1003   0.00001869     0.0000188769    0.0000185031
1004   0.00001864     0.0000188264    0.0000184536
1005   0.00001855     0.0000187355    0.0000183645
1006   0.00001859     0.0000187759    0.0000184041
1007   0.00001862     0.0000188062    0.0000184338
1008   0.00001875     0.0000189375    0.0000185625
1009   0.00001868     0.0000188668    0.0000184932
1010  0.00001867     0.0000188567    0.0000184833
1011  0.00001862     0.0000188062    0.0000184338
1012  0.00001859     0.0000187759    0.0000184041
1013  0.00001867     0.0000188567    0.0000184833
1014  0.00001871     0.0000188971    0.0000185229
1015  0.00001881     0.0000189981    0.0000186219
1016  0.00001879     0.0000189779    0.0000186021
1017  0.00001877     0.0000189577    0.0000185823
1018  0.00001878     0.0000189678    0.0000185922
1019  0.00001875     0.0000189375    0.0000185625
  • 1
    This is unclear. Do you mean that for row 1000, you want to find all rows from 0 to 1000 that are between ClosePrice_low and ClosePrice_high on row 1000? Or all rows where ClosePrice is between ClosePrice_low and ClosePrice_high for each respective row? – piRSquared Sep 11 '18 at 03:02
  • all rows where ClosePrice is between ClosePrice_low and ClosePrice_high for each respective row and the 999 above it – Federico Marchese Sep 11 '18 at 03:10

1 Answers1

0

In the unlikely event that I understand the question correctly:

df['cpl'] = df.ClosePrice_low.rolling(1000).max()
df['cph'] = df.ClosePrice_high.rolling(1000).min()
df = df[(df.ClosePrice <= df.cph) & (df.ClosePrice >= df.cpl)]
df.drop(['cpl', 'cph'], inplace=True)
df.sum()
Igor Rivin
  • 4,632
  • 2
  • 23
  • 35