
I'm trying to build a recurrent neural network using Keras, using the discussion presented here as a base.

However, in the solution proposed in the original discussion, as far as I understand, there is no concept of an "episode". Let me explain what I mean by that.

Imagine you have 6 instances x1, x2, x3, x4, x5, x6. Given a recurrent window of size 3, the first output is at x3; I'll refer to it as y3. So the input-output pairs without the episode concept look like this:

  • [x1, x2, x3], [y3]
  • [x2, x3, x4], [y4]
  • [x3, x4, x5], [y5]
  • [x4, x5, x6], [y6]

My data, however, have well-defined boundaries. In the example there would be two episodes, so the training pairs look like this:

  • [x1, x2, x3], [y3]
  • [x4, x5, x6], [y6]
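
To make the episode windowing concrete, here is a sketch of how such pairs could be built (a hypothetical `build_episode_windows` helper, assuming each episode is a NumPy array of shape `(T, n_features)` with a matching target vector):

```python
import numpy as np

def build_episode_windows(episodes, window=3):
    """Build (input, target) pairs that never cross an episode boundary.

    episodes: list of (xs, ys) tuples, where xs has shape (T, n_features)
    and ys has shape (T,). Only the window ending at each episode's last
    step is kept, matching the pairs listed above.
    """
    X, Y = [], []
    for xs, ys in episodes:
        if len(xs) >= window:
            X.append(xs[-window:])  # e.g. [x4, x5, x6]
            Y.append(ys[-1])        # e.g. y6
    return np.stack(X), np.array(Y)
```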

My question: is it possible to do this in Keras?

How should I organize my input-output pairs? The network should produce no prediction (no output) for any input except x3 and x6.

PS: I may use an LSTM or classical recurrence. If there is a solution using an LSTM, I would like to be able to reset the memory after each episode.

Thanks in advance.

arnaldocan

1 Answer

I believe this can be achieved by going one step back and restructuring and reshaping the data you feed to the RNN model itself. At the risk of sounding verbose, I offer the following explanation:

You should have an X and a y. I would suggest structuring both of these as 3D NumPy arrays, where

  • array[i] accesses a particular sequence i
  • array[i][j] accesses a particular time step j for a particular sequence i
  • array[i][j][k] accesses a specific feature k at a particular time step j for a particular sequence i (with the caveat that the feature dimension of y would be 1, since we are only predicting one target per time step)
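
To make the indexing concrete, a toy sketch (arbitrary zero-filled arrays standing in for real data):

```python
import numpy as np

# 8 sequences, 3 time steps, 5 features (1 target feature for y)
X = np.zeros((8, 3, 5))
y = np.zeros((8, 3, 1))

seq = X[2]         # sequence i=2, shape (3, 5)
step = X[2][1]     # time step j=1 of sequence i=2, shape (5,)
feat = X[2][1][4]  # feature k=4 at time step j=1 of sequence i=2, a scalar
```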

So, assuming you have 8 sequences, 3 time steps, and 5 features:

X.shape
# (8, 3, 5)

y.shape
# (8, 3, 1)

Now, assuming you've structured your data that way, all you have to do is make sure that the X and y training instances match each other in the way you desire. To use your notation:

print(X[0])
# [x1, x2, x3]

print(y[0][-1])
# [y3]

print(X[1])
# [x4, x5, x6]

print(y[1][-1])
# [y6]

Now say you already have this (sequence, time step, feature) 3D NumPy array structure for the data you are feeding into your model. Just delete the training instances you do not want from both your X and y.

For your example, that means dropping the two overlapping windows ([x2, x3, x4], [y4] and [x3, x4, x5], [y5]) and keeping only:

  • [x1, x2, x3], [y3]
  • [x4, x5, x6], [y6]
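
A minimal sketch of that deletion step, using a boolean mask over the stacked windows (toy values standing in for x1..x6):

```python
import numpy as np

# All four overlapping windows for x1..x6 (1 feature), as in the question.
X = np.array([[[1], [2], [3]],
              [[2], [3], [4]],
              [[3], [4], [5]],
              [[4], [5], [6]]], dtype=float)
y = np.array([[3], [4], [5], [6]], dtype=float)

# Keep only the windows that end an episode: indices 0 and 3.
keep = np.array([True, False, False, True])
X_train, y_train = X[keep], y[keep]
# X_train.shape == (2, 3, 1); y_train contains the targets y3 and y6
```

Regarding the memory-reset concern from the question: with the default `stateful=False`, a Keras LSTM starts each sample from a zero hidden state, so no state carries over between episodes; with `stateful=True` you would need to call `model.reset_states()` between episodes yourself.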
mberrett
    Thanks, I went for this approach and have trained the network some time ago. Now I'm just unsure if the memory is reset between episodes, or if there is some residue in the memory. – arnaldocan Jun 03 '19 at 18:46