I would like to develop a time series approach for binary classification with a stateful LSTM in Keras.

Here is how my data look. I have many, say N, recordings. Each recording consists of 22 time series of length M_i (i=1,...,N). I want to use a stateful model in Keras, but I don't know how to reshape my data, especially how I should define my batch_size.
Here is how I proceeded for a stateless LSTM. I created sequences of length look_back for all the recordings, so that I had data of shape (N*(M_i - look_back), look_back, n_features=22). Here is the function I used for that purpose:
```python
import numpy as np

def create_dataset(feat, targ, look_back=1):
    """Slice one recording into overlapping windows of length look_back."""
    dataX, dataY = [], []
    for i in range(len(targ) - look_back):
        a = feat[i:(i + look_back), :]  # window of shape (look_back, n_features)
        dataX.append(a)
        dataY.append(targ[i + look_back - 1])  # label of the window's last step
    return np.array(dataX), np.array(dataY)
```
where feat is the 2-D data array of shape (n_samples, n_features) for each recording, and targ is the target vector.
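To make the shapes concrete, here is a self-contained run of that same function on toy data (the recording length M=10 and the labels are made up for illustration):

```python
import numpy as np

def create_dataset(feat, targ, look_back=1):
    """Slice one recording into overlapping windows of length look_back."""
    dataX, dataY = [], []
    for i in range(len(targ) - look_back):
        dataX.append(feat[i:(i + look_back), :])
        dataY.append(targ[i + look_back - 1])
    return np.array(dataX), np.array(dataY)

# Toy recording: M=10 time steps, 22 channels, one binary label per step
feat = np.random.rand(10, 22)
targ = np.random.randint(0, 2, size=10)

X, y = create_dataset(feat, targ, look_back=3)
print(X.shape)  # (7, 3, 22) -> (M - look_back, look_back, n_features)
print(y.shape)  # (7,)
```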
So my question is: given the data described above, how should I reshape them for a stateful model, taking the batch notion into account? Are there precautions to take?
What I want to do is classify each time step of each recording as seizure/not seizure.
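For context, my understanding is that a stateful LSTM in Keras requires the batch size to be fixed up front via batch_input_shape, which is exactly the part I'm unsure how to choose. A minimal sketch of what I mean (the unit count 32 and batch_size=1 are placeholders, not a proposed solution):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed

look_back = 3
n_features = 22
batch_size = 1  # placeholder: this is the value I don't know how to pick

model = Sequential([
    # stateful layers need the batch dimension declared explicitly
    LSTM(32, stateful=True, return_sequences=True,
         batch_input_shape=(batch_size, look_back, n_features)),
    # one sigmoid output per time step (seizure / not seizure)
    TimeDistributed(Dense(1, activation="sigmoid")),
])
model.compile(loss="binary_crossentropy", optimizer="adam")

# With stateful=True, states carry over between batches and must be
# reset manually between recordings with model.reset_states().
```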
EDIT: Another problem I thought about: my recordings contain sequences of different lengths. A stateful model could learn long-term dependencies within each recording, but that means a different batch_size from one recording to another... How do I deal with that? Won't it cause generalization trouble when the model is tested on completely different sequences (the test set)?
Thanks