I'm trying to compute a recursive equation on the rows of dataframe with some arguments provided by different other datframes. The equation is provided below and should be performed for each columns of the matrix. It looks like an exponential moving average, except the decay is not constant and given from another dataframe.
Given:
- a matrix Alpha of the same size as the output
- a matrix P of the same size as the output
- a vector M0 of the same width as the output
I did a first try with a double loop (with .iloc):
import pandas as pd
import numpy as np
"""
Assuming inputs:
- Matrix P of size 1000x4
- Matrix alpha of size 1000x4
- Vector M0 of size 1X4
"""
# input variables
height = 1000
width = 4
np.random.seed(1)
P = pd.DataFrame(np.random.normal(loc=170, scale=12, size=(height, width)), index=range(height), columns=range(width))
np.random.seed(1)
alpha = pd.DataFrame(np.random.normal(loc=0.04, scale=0.04, size=(height, width)), index=range(height), columns=range(width))
np.random.seed(1)
M0 = pd.DataFrame(np.random.normal(loc=170, scale=12, size=(height, width)), columns=range(width))
# Output table
MA = P.copy()*0
MA.iloc[0] = M0
# Recursive equation
for x in range(width):
for y in range(1, height):
MA.iloc[y][x] = alpha.iloc[y][x]*P.iloc[y][x] + (1-alpha.iloc)* MA.iloc[y-1][x]
and a second try with vectorization by expanding the probleme into a cumulative prod (see equation below) but failed to retrieve the values expected (code will be updated later):
I could rework my math. However I was wondering if there was any more efficient/simple way to do it as it takes a while.
Thank you for any help !
Update 1: Few comments:
- My original dataframe is a price matrix for different assets (columns) and rows are days ascending downards (past at the top, present at the bottom)
- From there, my intital moving average day depends on a function depending on the asset returning me the initial window. Thus, the algorithm is not column-symetric -My strategy is to loop over the columns, to extract the desired vectors, to perform numpy calculation and to put it back in a dataframe:
Recursive way: I rewrote my code as :
ema = P.copy()*0
for x in ema.columns:
# define which row to start the algorithm
start = max (100, 250, int(windows[x]))
# store index (dates) to be re-inject after numpy calculus
i_d = (p.iloc[start:]).index
# extract corresponding vectors from original matrices
alpha_temp= alpha.iloc[start:][x].values
p_temp = p.iloc[start:][x].values
ema_temp = ema.iloc[start:][x].values
#MO
ema_temp[0] = m0[x]
#recursive equation
for y in range (1, len(ema_temp)):
ema_temp[y] = alpha_temp[y]*p_temp[y]+(1-alpha_temp[y])*ema_temp[y-1]
#transformation into a dtaframe and re-injection in the datframe ema
ema_temp = pd.DataFrame(ema_temp)
ema_temp.index=ema.index[-len(ema_temp):]
ema_temp.columns=[x]
ema.update(ema_temp)
Expansion of the equation
Thank you a_guest for your help.
# This is the product within the summation.
prod = np.flipud(np.cumprod(1 - np.flipud(alpha)))
# This is the sum over the scaled products.
sum_prod = np.cumsum((alpha * P)[:-1] * prod[1:])
# Combining all elements.
result = (alpha * P)[1:] + sum_prod + M0*prod[0]
I tried you code, but i could not provide the right answer. I'm not sure to understand it at 100%.
Assuming my data are downwards, the first row would provide :
I don't understand how it can be used in the second row as it already includes 1-a_n everywhere.
Thanks a lot !