I have the following sample of a DataFrame (the full data has 100k+ rows):

In:

    official        delta
    0               0.000201567           
    0               0.000194400            
    0               0.000151906            
    62.94957331     0.000144387            
    64.06471633     0.000125152            
    64.51335098     0.000133459            
    64.4101024      0.000120795            
    0               0.000146456

But I receive the following output:

    official        delta               result
    0               0.000201567         0
    0               0.0001944           0
    0               0.000151906         0
    62.94957331     0.000144387         0
    64.06471633     0.000125152         0
    64.51335098     0.000133459         0
    64.4101024      0.000120795         0
    0               0.000146456         0

Desired solution:

    official     delta          result
    0            0.000201567    0
    0            0.0001944      0
    0            0.000151906    0
    62.94957331  0.000144387    62.94957331
    64.06471633  0.000125152    64.06471633
    64.51335098  0.000133459    64.51335098
    64.4101024   0.000120795    64.4101024
    0            0.000146456    63.76600137

I tried the following code, but it does not seem to work correctly, and I do not understand why it gives a wrong result. When I run it on a demo DataFrame, everything is fine.

The code should pick the 'official' element when the mask is True; otherwise it should multiply the previous element by 0.99. The issue is that when the mask is True, the code does not pick the 'official' element.

    mask = (df['official'] < 51) & (df['delta'] > 0)

    df['result'] = df['official'].where(mask, 0.99).groupby(~mask.cumsum()).cumprod()
user3203275
  • You're calling mask as a function the second time - `~mask()` - when it probably shouldn't be. Try `df['result'] = df['official'].where(mask,0.99).groupby(~mask).cumprod()` – asongtoruin Sep 14 '18 at 14:56
  • Sorry, it was a typo from a bad copy-paste. See my code now. – user3203275 Sep 14 '18 at 21:04
  • You say you want the code to use the official element when mask is True, but your mask will be False when official < 51, even though your desired output shows you want 62.94 to give a result of 62.94. There's a sign error somewhere. – DSM Sep 14 '18 at 21:21
  • Sorry, but I don't follow. For index rows 0, 1, 2 the mask is False; for 3, 4, 5, 6 it is True; for 7 it is False. – user3203275 Sep 16 '18 at 11:55

1 Answer

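    # Invert the condition so that mask is True where 'official' should be kept
    mask = ~((df['official'] < 51) & (df['delta'] > 0))
    # Keep 'official' where mask is True; elsewhere use 0.99 * the previous row's 'official'
    df['result1'] = df['official'].where(mask, 0.99 * df['official'].shift(1)).fillna(0.0)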

Invert the mask with the unary `~` operator. `where` then keeps the 'official' element wherever the mask is True; elsewhere it takes 'official' shifted down by one row and multiplied by 0.99. `fillna(0.0)` fills the NaN that the shift introduces in the first row.

Result:

    official     delta     result    result1
0   0.000000  0.000202   0.000000   0.000000
1   0.000000  0.000194   0.000000   0.000000
2   0.000000  0.000152   0.000000   0.000000
3  62.949573  0.000144  62.949573  62.949573
4  64.064716  0.000125  64.064716  64.064716
5  64.513351  0.000133  64.513351  64.513351
6  64.410102  0.000121  64.410102  64.410102
7   0.000000  0.000146  63.766001  63.766001
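
For anyone who wants to reproduce this, here is a minimal self-contained sketch built from the eight sample rows in the question. The names `keep`, `last_kept`, `steps` and `result_decay` are made up for illustration. The `result_decay` column is an assumption on my part: it reads "multiply its previous element with 0.99" as compounding, so that several consecutive replaced rows would decay by 0.99 each step from the last kept value. On this sample both columns agree, because no kept value is followed by more than one replaced row.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "official": [0, 0, 0, 62.94957331, 64.06471633,
                 64.51335098, 64.4101024, 0],
    "delta": [0.000201567, 0.000194400, 0.000151906, 0.000144387,
              0.000125152, 0.000133459, 0.000120795, 0.000146456],
})

# True where the 'official' value should be kept unchanged.
keep = ~((df["official"] < 51) & (df["delta"] > 0))

# Approach from this answer: 0.99 * the previous row's 'official' elsewhere.
df["result1"] = (
    df["official"].where(keep, 0.99 * df["official"].shift(1)).fillna(0.0)
)

# Hypothetical compounding variant (an assumption, not part of the answer):
# carry the last kept value forward and decay it by 0.99 per replaced row.
last_kept = df["official"].where(keep).ffill()
steps = (~keep).astype(int).groupby(keep.cumsum()).cumsum()
df["result_decay"] = (last_kept * 0.99 ** steps).fillna(0.0)

print(df)
```

Note that `result1` always looks at the previous row of 'official', not at the previous result, so with two or more replaced rows in a row the second one would get 0.99 times a value that was itself below the threshold. If that is not what you want on the full 100k+ rows, something along the lines of `result_decay` may be closer.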
Jarad