
The code below tries to solve the following task: "Find the maximum price change over any 5-day rolling window, over 1000-day period".

By "any 5-day rolling window", I don't just mean "t_i + 5", but rather "t_i + j", where "i" varies from 1 to 1000 and "j" varies from 1 to 5.

I have tried to use native NumPy functions, but I still ended up using a for loop for the inner iteration. Here is the code:

import numpy as np

prices = np.random.random([1000,1])*1000

max_array = np.zeros([(prices.size-5),1])
for index, elem in np.ndenumerate(prices[:-5,:]):
    local_max = 0.0
    for i in range(1,6,1):
        price_return = prices[(index[0] + i),0] / elem
        local_max = max(local_max, price_return)
    max_array[index[0]] = local_max
global_max = np.amax(max_array)

Can I eliminate the inner for loop and use NumPy vectorization instead?
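One possible way to drop the inner loop, offered as a sketch rather than the accepted approach: build all the look-ahead windows at once with `numpy.lib.stride_tricks.sliding_window_view` (this assumes NumPy 1.20 or later) and let broadcasting do the division. The names `horizon`, `future` and `base` are made up for illustration, and `prices` is assumed to be 1-D here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)   # seeded only so the sketch is repeatable
prices = rng.random(1000) * 1000  # 1-D array (assumption: no trailing axis)

horizon = 5  # look ahead 1..5 days, like range(1, 6) in the inner loop

# Row i holds prices[i+1], ..., prices[i+5]
future = sliding_window_view(prices[1:], horizon)  # shape (995, 5)
base = prices[:-horizon]                           # shape (995,)

# Broadcasting divides every look-ahead price by the window's starting price
returns = future / base[:, None]
max_array = returns.max(axis=1)  # same values as max_array in the loop version
global_max = max_array.max()
```

`sliding_window_view` returns a read-only view, so no 995 x 5 copy of the prices is made until the division allocates `returns`.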

Also, I don't particularly like using "index[0]" to extract the actual index of the current loop from the tuple object that is returned into the variable "index" via the call:

for index, elem in np.ndenumerate(prices[:-5,:]):

Can that also be improved?
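On the `index[0]` point, one option (a sketch, assuming the trailing length-1 axis is not needed) is to flatten `prices` and iterate with the built-in `enumerate`, which yields a plain integer index. The name `flat` is introduced here for illustration only.

```python
import numpy as np

prices = np.random.random([1000, 1]) * 1000
flat = prices.ravel()  # 1-D; no copy is made for a contiguous array

max_array = np.zeros(flat.size - 5)
for i, elem in enumerate(flat[:-5]):
    # i is a plain int, so no index[0] unpacking is needed;
    # the slice covers the next five days, replacing the inner loop
    max_array[i] = (flat[i + 1:i + 6] / elem).max()
global_max = max_array.max()
```

`np.ndenumerate` returns index tuples because it is designed for arrays of any dimension; with 1-D data `enumerate` is enough.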

Jan Stuller
  • "Find the maximum price change over any 5-day rolling window, over 1000-day period". If you explicitly look for a 5-day window, why do you need the inner loop at all? You're just interested in the maximum diff between your min and max of each window iteration. No benefit of checking 1-4 days as well, the result will be the same – po.pe Jul 03 '20 at 11:45
  • I don't think the result will be the same. The maximum change within any 5-day rolling window does not necessarily arise from just the first and the last day in the 5-day window. I am interested in the maximum change that could happen between any two days within any 5-day window. – Jan Stuller Jul 03 '20 at 11:51
  • Okay, wasn't clear to me that the change has to be within two consecutive days – po.pe Jul 03 '20 at 11:54

1 Answer


Using pandas rolling window for min and max

Allows computation without for loops

Inspired by Max in a sliding window in NumPy array

import pandas as pd
import numpy as np

# Generate Data
prices = np.random.random([1000,1])*1000
prices = prices.flatten()

# Pandas rolling window (max in 5 day period)
# Convert series back to numpy array
maxs = pd.Series(prices).rolling(5).max().dropna().to_numpy()

# Pandas rolling window (min in 5 day period)
# Convert series back to numpy array
mins = pd.Series(prices).rolling(5).min().dropna().to_numpy()

# NumPy subtraction: difference between max and min in each window
delta = maxs - mins

Results (show first 10 elements)

print('prices: ', prices[:10])
print('maxs: ', maxs[:10])
print('mins: ', mins[:10])
print('max-change: ', delta[:10])

Output (first 10 elements)

prices:  [416.67356904 244.29395291 325.50608035 102.67426207 794.36067353
 318.22836941 113.48811096 898.87130071 303.06297351 285.80963998]
maxs:  [794.36067353 794.36067353 794.36067353 898.87130071 898.87130071
 898.87130071 898.87130071 898.87130071 828.87148828 828.87148828]
mins:  [102.67426207 102.67426207 102.67426207 102.67426207 113.48811096
 113.48811096 113.48811096 285.80963998 285.80963998 106.4036413 ]
max-change:  [691.68641146 691.68641146 691.68641146 796.19703863 785.38318975
 785.38318975 785.38318975 613.06166073 543.06184831 722.46784698]
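The code above stops at the per-window `delta`, while the task asks for a single maximum. A minimal sketch of that last step, assuming "change" means the max-minus-min spread within a window regardless of which day comes first; `roll` and `window_end` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)   # seeded only so the sketch is repeatable
prices = rng.random(1000) * 1000

# One Rolling object can be reused for both the max and the min
roll = pd.Series(prices).rolling(5)
delta = (roll.max() - roll.min()).dropna().to_numpy()

# Largest spread over all 5-day windows
global_max = delta.max()

# delta[k] belongs to prices[k:k+5], so that window ends on day k + 4
window_end = int(delta.argmax()) + 4
```

If the direction of the move matters (for example, only a low followed by a later high), max minus min is not sufficient on its own, because it ignores the order of the two days.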
DarrylG
  • Thank you so much. So it seems that efficient coding in Python comes down (amongst other considerations) to consciously selecting the most suitable library for the specific task? I find this a bit frustrating, because in a C-type language, learning how to use loops efficiently would be good enough to solve this type of problem. In Python, learning each library is almost like learning a new coding language in itself... so really, learning efficient Python comes down to learning the specific libraries. – Jan Stuller Jul 03 '20 at 12:13
  • @JanStuller--I updated my answer with the source of my inspiration for my current answer to show how to make it less frustrating. Looking at [stackoverflow tag count](https://stackoverflow.com/tags) we see that Python is in the top 3 in Q&A on stackoverflow. So, I've found the best approach is to first look to see if a similar problem has been solved before. This will then point me at the best options for library and functions within the library. – DarrylG Jul 03 '20 at 12:23
  • Thank you once again. Btw, on a second look, the method you propose only returns the max over a rolling 5-day window. I am however interested in the max difference between **any two days** within **any five-day rolling window**. So there is still the need for the inner loop which goes from "1" to "5" within all 5-day rolling windows. – Jan Stuller Jul 03 '20 at 12:58
  • @JanStuller--let me understand: if prices = [6, 2, 3, 4, 5, 1]. Then using a 5 day rolling window, wouldn't maxs = [6, 5] and min = [2, 1], so max-change = [4, 4]? Is this correct? If so, this is what the answer does. If not, can you explain what maxs, mins, and max-change should be? – DarrylG Jul 03 '20 at 13:18
  • You are correct. I see now that your code does the job! Smart way of doing it, I like it very much. – Jan Stuller Jul 03 '20 at 13:18
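The small example from the comments can be checked directly. This is a quick sketch using the same pandas calls as the answer, applied to the six prices quoted in the comment.

```python
import pandas as pd

prices = pd.Series([6, 2, 3, 4, 5, 1])
roll = prices.rolling(5)

# Two complete 5-day windows: [6, 2, 3, 4, 5] and [2, 3, 4, 5, 1]
maxs = roll.max().dropna().to_numpy()  # 6 and 5
mins = roll.min().dropna().to_numpy()  # 2 and 1
delta = maxs - mins                    # 4 and 4

print(maxs, mins, delta)
```

The values agree with the `maxs = [6, 5]`, `min = [2, 1]`, `max-change = [4, 4]` worked out in the comment.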