Python DataFrame: find the rolling slope over n rows without a for loop

Question

I am trying to take n rows of a DataFrame at a time and compute the slope of a linear regression over them. I want to avoid a for loop because my df has 30k rows and a loop may be slow. So the objective is to use a pandas function to compute the slope over each window of n rows.

My code:

import numpy as np
import pandas as pd
from scipy import stats

dfx = pd.DataFrame({'A': [10, 20, 15, 30, 1.5, 0.6, 7, 0.8, 90, 10]})
n = 2  # window size: number of samples per regression
cl_id = dfx.columns.tolist().index('A')  # positional index of column 'A', for use with .iloc
l1 = ['NaN'] * n + [stats.linregress(dfx.iloc[x + 1 - n:x + 1, cl_id].tolist(), [1, 2])[0] for x in np.arange(n, len(dfx))]
dfx['slope'] = l1
print(dfx)
      A      slope
0  10.0        NaN
1  20.0        NaN  #stats.linregress([10,20],[1,2])[0] is missing here. Why?
2  15.0       -0.2  #stats.linregress([20,15],[1,2])[0] = -0.2
3  30.0  0.0666667  #stats.linregress([15,30],[1,2])[0] = 0.06667
4   1.5 -0.0350877
5   0.6   -1.11111
6   7.0    0.15625
7   0.8   -0.16129
8  90.0  0.0112108
9  10.0    -0.0125

Everything else works fine. Is there a more Pythonic way of doing this, for example with the rolling() function?

one of your questions is why your new series starts with two `NaN`s, and the answer is because you are prepending those two values with `l1=['NaN']*n+[...` — RichieV, Sep 04 '20 at 15:06
Does this answer your question? [Pandas - Rolling slope calculation](https://stackoverflow.com/questions/42138357/pandas-rolling-slope-calculation) — RichieV, Sep 04 '20 at 15:09
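
To illustrate the first comment: a window of `n` rows ending at position `x` is first complete at `x = n - 1`, so only `n - 1` leading `NaN` values are needed and the loop can start one step earlier. The following is a minimal sketch of that adjustment, reusing the names from the question. Using `np.nan` instead of the string `'NaN'` and building the `[1, 2]` list from `n` are assumptions of this sketch, not part of the original code.

```python
import numpy as np
import pandas as pd
from scipy import stats

dfx = pd.DataFrame({'A': [10, 20, 15, 30, 1.5, 0.6, 7, 0.8, 90, 10]})
n = 2
y = np.arange(1, n + 1)  # 1..n, generalising the hard-coded [1, 2]

# The first full window ends at position n - 1, so pad with n - 1 NaNs only.
slopes = [np.nan] * (n - 1) + [
    stats.linregress(dfx['A'].iloc[x + 1 - n:x + 1].to_numpy(), y).slope
    for x in range(n - 1, len(dfx))
]
dfx['slope'] = slopes
print(dfx)
```

With `np.nan` the `slope` column stays numeric (float64) rather than becoming an object column of mixed strings and floats.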

score 1 · Accepted Answer · answered Sep 04 '20 at 16:06

n = 2
dfx.A.rolling(n).apply(lambda x: stats.linregress(x, x.index+1)[0], raw=False)

Output:

0         NaN
1    0.100000
2   -0.200000
3    0.066667
4   -0.035088
5   -1.111111
6    0.156250
7   -0.161290
8    0.011211
9   -0.012500
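
Since `rolling().apply()` still calls a Python function once per window, a fully vectorised variant may be worth trying on 30k rows. The sketch below is not from the answer; it relies on the fact that the least-squares slope of `y` regressed on `x` equals `cov(x, y) / var(x)`, and uses the pandas rolling `cov` and `var` methods. The name `pos` is hypothetical and stands in for `x.index + 1`.

```python
import numpy as np
import pandas as pd

dfx = pd.DataFrame({'A': [10, 20, 15, 30, 1.5, 0.6, 7, 0.8, 90, 10]})
n = 2

# Positions 1..len(dfx) play the role of x.index + 1 in the answer.
pos = pd.Series(np.arange(1, len(dfx) + 1), index=dfx.index, dtype=float)

# Slope of pos regressed on A within each window: cov(A, pos) / var(A).
# Both default to ddof=1, so the normalisation cancels in the ratio.
roll = dfx['A'].rolling(n)
dfx['slope'] = roll.cov(pos) / roll.var()
print(dfx)
```

This should reproduce the values shown above up to floating-point rounding. A window in which all values of `A` are equal has zero variance, so the ratio is undefined there and needs separate handling if such windows can occur.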

Mohsin hasan
