I'm new to Keras, and I find it hard to understand the shape of input data of the LSTM layer.The Keras Documentation says that the input data should be 3D tensor with shape (nb_samples, timesteps, input_dim). I'm having trouble of understanding this format. Does the timesteps variable represent the number of timesteps the network remembers?
In my data a few time steps affect the output of the network but I do not know how many in advance i.e. I can't say that the previous 10 samples affect the output. For example the input can be words that form sentences. There is an important correlation between the words in each sentence. I don't know the length of the sentence in advance, this length also vary from one sentence to another. I do know when the sentence ends (i.e. i have a period that indicates the ending). Two different sentences has no affect one on the other - there is no need to remember the previous sentence.
I'm using the LSTM network for learning a policy in reinforcement learning, so I don't have a fixed data set. The agent's policy will change the length of the sentence.
How should I shape my data? How should it be fed into the Keras LSTM layer?