
I want to create a simple LSTM which will try to predict stock prices based on 3 variables: open_price, close_price and volume.

  1. I want to predict the open_price and close_price only
  2. I want to predict the prices for the next 5 days
  3. I want to predict the prices based on 10 previous days
  4. I believe that tomorrow's prices will be affected by both today's prices and volume

I've been playing with this example, which uses just one variable (price). But I can't figure out how to make it work with multiple variables.

Please remember that while I'm only predicting prices, I still want my network to use past volume data as well, since it might carry some information about price changes.
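
To make the shapes concrete, this is roughly how I imagine slicing the data into samples (just a sliding-window assumption on my part, the variable names are made up):

import numpy as np

n_past = 10    # days of history fed into the network
n_future = 5   # days to predict
rows = 100

# index 0: open_price, index 1: close_price, index 2: volume
dataset = np.random.rand(rows, 3)

x_windows, y_windows = [], []
for i in range(rows - n_past - n_future + 1):
    x_windows.append(dataset[i : i + n_past])                          # (10, 3): open, close, volume
    y_windows.append(dataset[i + n_past : i + n_past + n_future, :2])  # (5, 2): open and close only

x_train = np.array(x_windows)  # (samples, 10, 3)
y_train = np.array(y_windows)  # (samples, 5, 2)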

Here is my code using some random values:

from keras.layers.recurrent import LSTM
from keras.models import Sequential
import numpy as np

my_input_dim = ???
my_output_dim = ???

model = Sequential()
model.add(LSTM(
        input_dim=my_input_dim,
        output_dim=my_output_dim,
        return_sequences=True))        

model.compile(loss='mse', optimizer='adam')

rows = 100
# index 0: open_price, index 1: close_price, index 2: volume
dataset = np.random.rand(rows,3)

my_x_train = ???
my_y_train = ???

model.fit(
        my_x_train,
        my_y_train,
        batch_size=1,
        nb_epoch=10,
        validation_split=0.20)

data_from_10_previous_days = np.random.rand(10,3)

# should be a 2D array of length 5 (prices for the next 5 days)
prediction = model.predict(data_from_10_previous_days)

Could you please help me finish it?

EDIT

I think I made some progress here:

timesteps = 5
out_timesteps = 5
input_features = 4
output_features = 3
xs = np.arange(20 * timesteps * input_features).reshape((20,timesteps,input_features)) # (20,5,4)
ys = np.arange(20 * out_timesteps * output_features).reshape((20,out_timesteps,output_features)) # (20,5,3)
model = Sequential()
model.add(LSTM(13, input_shape=(timesteps, input_features), return_sequences=True))
model.add(LSTM(17, return_sequences=True))
model.add(LSTM(output_features, return_sequences=True)) # output_features (output dim) in the LAST layer needs to match the number of features per timestep in ys
model.compile(loss='mse', optimizer='adam')
model.fit(xs,ys,batch_size=1,nb_epoch=1,validation_split=0.20)
prediction = model.predict(xs[0:2]) # (2,5,3)

So my single sample is

  • input: a sequence (5 items long), and every item has 4 features
  • output: a sequence (5 items long), and every item has 3 features
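
For what it's worth, this is how I sanity-check those shapes (just my own check, nothing special):

print(model.output_shape)  # (None, 5, 3): 5 timesteps, 3 features per timestep
print(prediction.shape)    # (2, 5, 3) for the two samples passed to predict()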

However, I'm unable to change the output sequence length (out_timesteps = 2). I get an error:

ValueError: Error when checking target: expected lstm_143 to have shape (None, 5, 3) but got array with shape (20, 2, 3)

I think about it like this: based on the last 5 days, predict the next 2 days.
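
For reference, the only change I make to get that error is shrinking the target sequence (the model stays exactly as above):

out_timesteps = 2
ys_short = np.arange(20 * out_timesteps * output_features).reshape((20, out_timesteps, output_features))  # (20,2,3)

# the last LSTM still has return_sequences=True, so it emits 5 timesteps, not 2
model.fit(xs, ys_short, batch_size=1, nb_epoch=1, validation_split=0.20)  # raises the ValueError shown above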

What can I do with that error?

Andrzej Gis
  • Your input data is an array of dimension (10,3): 10 days, 3 values. Your output is only (5,1): 5 prices. You need to add at least a Dense layer after the LSTM to be able to condense the information of the LSTM into your required outputs. I'm pretty sure that you can find more info in the keras examples. – Daniel GL Oct 10 '17 at 07:26
  • I answered a very similar question [here](https://stackoverflow.com/questions/45764629/machine-learning-how-to-use-the-past-20-rows-as-an-input-for-x-for-each-y-value/45765082#45765082) that should clear things up, except maybe you want 5 output nodes instead of 1. – DJK Oct 10 '17 at 17:36
  • Possible duplicate of [Neural Network LSTM input shape from dataframe](https://stackoverflow.com/questions/39674713/neural-network-lstm-input-shape-from-dataframe) – charlesreid1 Oct 10 '17 at 20:30
  • @djk47463 The number of outputs makes a great difference here (return_sequences=True changes the output shape). Could you please take a look at the updated question? – Andrzej Gis Oct 13 '17 at 18:23
  • @charlesreid1 The question you linked was very helpful, but didn't solve the whole problem. Please take a look at the update. – Andrzej Gis Oct 13 '17 at 18:24
  • @DanielGL Please look at the updated question. Does this approach still need a dense layer? I think I got it right, with one small exception: I'm unable to control the length of the output sequence. – Andrzej Gis Oct 13 '17 at 18:26
  • I think @djk47463 is right, his example explains our point. You should get a better understanding of the LSTM and check the keras examples to see why. Using only LSTMs, your input and output will both have the same number of timesteps. – Daniel GL Oct 16 '17 at 13:42

0 Answers