I'm working with financial time series data and am a bit confused by numpy's reshape function. My goal is to calculate log-returns for the adj_close column.

inputs = np.array([df_historical_data[key][-W:], axis = 1).values for key in stock_list])
inputs.shape  # (8, 820, 5)

prices = inputs[:, :, 0]
prices.shape  # (8, 820)
prices[:,0]
array([  4.17000004e+02,   4.68800000e+00,   8.47889000e-03,
     3.18835850e+00,   3.58412583e+00,   8.35364850e-01,
     5.54610005e-04,   3.33600003e-05])  # close prices of the 8 stocks on day 0

However, my program needs the inputs to have shape (820, 8, 5), so I decided to reshape the numpy array:

inputs = np.array([df_historical_data[key][-W:], axis = 1).values for key in stock_list]).reshape(820, 8, 5)
inputs.shape  # (820, 8, 5)

prices = inputs[:, :, 0]
prices.shape  # (820, 8)
prices[0]
array([ 417.00000354,  436.5100001 ,  441.00000442,  440.        ,
        416.10000178,  409.45245   ,  422.999999  ,  432.48000001]) 
# close prices of 1 stock over 8 days,
# but it should be the same as in the example above

It seems that I didn't reshape my array properly, but I can't understand why this strange behaviour occurs.

Daniel Chepenko

1 Answer

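What you need is transpose, not reshape. reshape keeps the elements in their original flattened order and only regroups them into a new shape, whereas transpose permutes the axes.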

Let's assume we have an array as follows:

import numpy as np

m, w, l = 2, 3, 4
array1 = np.array([[['m%d w%d l%d' % (mi, wi, li) for li in range(l)] for wi in range(w)] for mi in range(m)])
print(array1.shape)
print(array1)

Reshape is probably not what you want, but here is what it does:

array2 = array1.reshape(w, m, l)
print(array2.shape)
print(array2)
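
The difference can also be checked directly. The sketch below is a minimal illustration on a small integer array; the names `a`, `reshaped` and `transposed` are made up for this example and do not come from the question.

```python
import numpy as np

m, w, l = 2, 3, 4
a = np.arange(m * w * l).reshape(m, w, l)   # hypothetical stand-in for array1

reshaped = a.reshape(w, m, l)               # same shape as the transpose below
transposed = a.transpose(1, 0, 2)

# reshape never moves elements: the flattened order is unchanged
same_flat_order = np.array_equal(reshaped.ravel(), a.ravel())

# transpose swaps the first two axes: transposed[j, i, k] == a[i, j, k]
axes_swapped = all(
    transposed[j, i, k] == a[i, j, k]
    for i in range(m) for j in range(w) for k in range(l)
)

# identical shapes, different contents
differ = not np.array_equal(reshaped, transposed)

print(same_flat_order, axes_swapped, differ)
```

Both results have shape (w, m, l), which is why the reshape in the question raised no error while silently mixing days and stocks.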

Here is how transpose is done:

#                         originally
#                         0, 1, 2
#                         m, w, l
#                         -------

#                         transposed
array3 = array1.transpose(1, 0, 2)
#                         w, m, l

print(array3.shape)
print(array3)
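
Applied to the shapes in the question, this might look like the sketch below. It uses synthetic random data instead of the real DataFrames, assumes (as in the question) that feature 0 is the adjusted close, and the names `by_day` and `log_returns` are illustrative only.

```python
import numpy as np

n_stocks, n_days, n_features = 8, 820, 5

# Synthetic positive "prices" standing in for the real inputs array
rng = np.random.default_rng(0)
inputs = rng.uniform(1.0, 100.0, size=(n_stocks, n_days, n_features))

# Swap the stock and day axes: (8, 820, 5) -> (820, 8, 5)
by_day = inputs.transpose(1, 0, 2)
prices = by_day[:, :, 0]                    # (820, 8)

# Row 0 now holds the 8 stocks on day 0, as in the first example
assert np.array_equal(prices[0], inputs[:, 0, 0])

# Log-returns along the time axis: log(p_t) - log(p_{t-1})
log_returns = np.diff(np.log(prices), axis=0)   # (819, 8)

print(by_day.shape, prices.shape, log_returns.shape)
```

Taking the difference along axis 0 only makes sense once that axis really is time, which is what the transpose guarantees.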
Szabolcs Dombi