Function to calculate value from first row in Python Pandas

Question

Is there any function in pandas to simulate excel formula like '=sum($A$1:A10'(for 10th row), i.e. the formula should take rolling data from 1st row.

Pandas rolling function needs a integer value as window argument.

Please check [How to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — jezrael, Sep 21 '16 at 11:55
Instead of the 30 argument for rolling window for rolling, is there any way to make it dynamic(row number). rolling(30).apply(func=np.percentile,args=(0.90,)). Like the excel $A$1:current cell formulae? — Chandan Kumar, Sep 21 '16 at 12:00

score 2 · Accepted Answer · edited May 23 '17 at 12:08

2

The equivalent of =SUM($A$1:A1) in pandas is .expanding().sum() (requires pandas 0.18.0):

ser = pd.Series([1, 2, 3, 4])

ser
Out[3]: 
0    1
1    2
2    3
3    4
dtype: int64

ser.expanding().sum()
Out[4]: 
0     1.0
1     3.0
2     6.0
3    10.0

You can also apply a generic function via apply:

ser.expanding().apply(lambda x: np.percentile(x, 90))
Out: 
0    1.0
1    1.9
2    2.8
3    3.7
dtype: float64

Or directly with quantile:

ser.expanding().quantile(0.9)
Out[15]: 
0    1.0
1    1.0
2    2.0
3    3.0
dtype: float64

Note that 90th percentile is equal to 0.9th quantile. However, Series.quantile and Series.expanding.quantile are returning different results which is probably a bug.

np.percentile returns the same results as Excel's PERCENTILE.INC. For PERCENTILE.EXC, I've previously wrote a small function here.

edited May 23 '17 at 12:08

Community

1
1

answered Sep 21 '16 at 12:05

ayhan

70,170
20
182
203

Thanks..this looks like exactly what is needed but the percentile values dont match with excel values(can see that they are not correct even visually). percentile(0.9) should be 1 1.93 2.86 3.79 – Chandan Kumar Sep 21 '16 at 12:12
@ChandanKumar For 90th percentile you need to pass 90 as the parameter. However, Series.expanding.quantile seems buggy so it might be better to use numpy version right now. – ayhan Sep 21 '16 at 18:37

Function to calculate value from first row in Python Pandas

1 Answers1