I want to vectorize a for loop to speed things up. My code is the following:

import numpy as np
import pandas as pd


def my_func(array, n):
    # Exponentially weighted moving average with span n (pandas default adjust=True).
    return pd.Series(array).ewm(span=n, min_periods=n - 1).mean().to_numpy()

np.random.seed(0)
data_size = 120000

# Uniform noise in [29000, 30000)
data = np.random.uniform(0, 1000, size=data_size) + 29000

loop_size = 1000
step_size = 1

# One output column per span value 1, 2, ..., loop_size
X = np.zeros([data.shape[0], loop_size])
parameter_array = np.arange(1, loop_size + step_size, step_size)

for i in parameter_array:
    X[:, i - 1] = my_func(data, i)

The entire for loop takes about a minute to finish, which could be a problem for future applications. I have already looked at numpy.vectorize(), but its documentation states clearly that it is provided for convenience only, so using it won't speed up the code by an order of magnitude.

My question is: is there a way to vectorize a for loop like this? If so, could I see a simple example of how it can be done?

Thank you in advance.

mathguy
  • You have to figure out a way of doing that `ewm` for multiple values `n` with one call. – hpaulj May 25 '19 at 05:33
  • @hpaulj I haven't figured that out yet, which is another reason I posted this question here. Maybe I should modify the title slightly. – mathguy May 25 '19 at 05:37
  • See if this helps out - https://stackoverflow.com/questions/42869495/ – Divakar May 25 '19 at 06:06
  • @Divakar Thanks for your reply. The new ema functions in that post's answers work fine for arrays whose length is below a certain number (I believe it is around 15k), but when the length is as big as my data's (120k), the outputs from these functions contain a lot of inf values at the tails. I think the reason is that when the length is large enough, the tail of the vector alpha**np.arange(length) underflows to zero, which leads to a lot of 1/0 divisions. – mathguy May 25 '19 at 06:54
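
One possible direction that follows from the comments above: rather than vectorizing over the time axis with powers of alpha (which is where the underflow and the inf values come from), keep a loop over time and vectorize over all spans at once. The sketch below is not taken from the question or the linked post. The function name ewm_mean_multi_span is hypothetical, and it assumes the pandas defaults (adjust=True, alpha = 2 / (span + 1)) and input data without NaNs. It carries a running numerator and denominator of the weighted average, so no large powers are ever formed.

```python
import numpy as np


def ewm_mean_multi_span(values, spans):
    """Hypothetical helper: adjusted EWM mean for many spans in one pass.

    Assumes alpha = 2 / (span + 1), adjust=True weighting, no NaNs in
    `values`, and min_periods = span - 1 as in the question.
    Returns an array of shape (len(values), len(spans)).
    """
    values = np.asarray(values, dtype=float)
    spans = np.asarray(spans, dtype=float)

    # decay = 1 - alpha, one entry per span
    decay = 1.0 - 2.0 / (spans + 1.0)

    out = np.empty((values.shape[0], spans.shape[0]))
    num = np.zeros_like(decay)  # running weighted sum of the data
    den = np.zeros_like(decay)  # running sum of the weights

    # Loop over time; every step updates all spans with vector operations.
    for t, x in enumerate(values):
        num = x + decay * num
        den = 1.0 + decay * den
        out[t] = num / den

    # Mimic min_periods = span - 1: rows with too few observations are NaN.
    min_periods = np.maximum(spans - 1.0, 1.0)
    counts = np.arange(1, values.shape[0] + 1)[:, None]
    out[counts < min_periods] = np.nan
    return out
```

With the variables from the question this would be called as X = ewm_mean_multi_span(data, parameter_array). The Python loop then runs once per data point instead of once per span, with each iteration operating on a vector of 1000 spans. I have not timed it on the full 120k by 1000 problem, so treat any speed-up as something to measure rather than a given.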
