I have built a Keras LSTM model for sequence classification. The training set has 27 sequences and the test set has 18. Each sequence is zero-padded to 4000 time steps and consists of 2499 parallel series, so I have 2499 features.
- X_Train has shape (27 x 4000 x 2499): 27 sequences, 4000 time steps per sequence, and 2499 features.
- Y_Train has shape (27 x 4000 x 1)
- X_Test has shape (18 x 4000 x 2499)
- Y_Test has shape (18 x 4000 x 1)
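To make the shapes above concrete, here is a minimal NumPy sketch of zero-padding variable-length sequences into a single 3-D array. It uses toy sizes (3 sequences, 8 time steps, 5 features) instead of the real ones, and it assumes the zeros are appended at the end of each sequence; `pad_to_length`, `raw_X` and `raw_y` are hypothetical names, not part of my actual code.

```python
import numpy as np

def pad_to_length(sequences, n_timesteps):
    """Zero-pad (timesteps, features) arrays at the end to a common length."""
    n_features = sequences[0].shape[1]
    out = np.zeros((len(sequences), n_timesteps, n_features),
                   dtype=sequences[0].dtype)
    for i, seq in enumerate(sequences):
        length = min(len(seq), n_timesteps)  # truncate if longer than the target
        out[i, :length] = seq[:length]
    return out

# Toy stand-in: 3 sequences of different lengths, 5 features each
rng = np.random.default_rng(0)
raw_X = [rng.normal(size=(n, 5)) for n in (4, 8, 6)]
raw_y = [rng.integers(0, 2, size=(n, 1)) for n in (4, 8, 6)]

X_toy = pad_to_length(raw_X, 8)  # (sequences, time steps, features)
y_toy = pad_to_length(raw_y, 8)  # (sequences, time steps, 1)
print(X_toy.shape, y_toy.shape)  # (3, 8, 5) (3, 8, 1)
```

With the real data the same layout gives (27, 4000, 2499) for X_Train and (27, 4000, 1) for Y_Train, i.e. one label per time step.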
I am using a bidirectional LSTM model with return_sequences set to True.
My ultimate goal is to get feature importances using the ELI5 package's permutation importance.
Since ELI5's permutation importance expects a scikit-learn style estimator rather than a Keras model, I want to wrap the model in the Keras scikit-learn wrapper so that it behaves like a scikit-learn estimator.
Then I can finally run ELI5 on my model to find the important features.
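For clarity, this is the kind of result I am after, sketched by hand in NumPy for 3-D input. It is not ELI5's API, only an illustration of the permutation importance idea under my own assumptions: a feature's whole series is swapped between sequences (so the time order inside each series is kept), and the score is per-time-step accuracy. `permutation_importance_3d`, `accuracy` and `toy_predict` are hypothetical names, and `toy_predict` stands in for the trained model's predict function.

```python
import numpy as np

def permutation_importance_3d(predict, X, y, score, n_repeats=3, seed=0):
    """Mean score drop when one feature's series are shuffled across sequences.

    X: (n_sequences, n_timesteps, n_features); y: (n_sequences, n_timesteps, 1).
    """
    rng = np.random.default_rng(seed)
    baseline = score(y, predict(X))
    importances = np.zeros(X.shape[2])
    for j in range(X.shape[2]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Swap whole series of feature j between sequences
            X_perm[:, :, j] = X[rng.permutation(X.shape[0]), :, j]
            drops.append(baseline - score(y, predict(X_perm)))
        importances[j] = np.mean(drops)
    return baseline, importances

def accuracy(y_true, y_prob):
    """Fraction of time steps whose thresholded prediction matches the label."""
    return float(np.mean((y_prob > 0.5) == (y_true > 0.5)))

# Toy stand-in for the trained model: labels depend only on feature 0
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(20, 10, 4))
y_toy = (X_toy[:, :, [0]] > 0).astype(int)
toy_predict = lambda X: (X[:, :, [0]] > 0).astype(float)

baseline, importances = permutation_importance_3d(
    toy_predict, X_toy, y_toy, accuracy)
print(baseline, importances)
```

In this toy setup only feature 0 should show a score drop, because the stand-in model ignores the other features. Note that the sketch does not mask the zero-padded time steps, which the real data would probably need.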
I have used these parameters:
- layer1_units = 40
- layer1_act = 'tanh'
- go_backwards = False
- return_sequences = True
- merge_mode = 'concat'
- lr = 0.01
- epochs = 2
- batch_size = 200
I pass this LSTM model as the build_fn argument of KerasClassifier():
model = KerasClassifier(build_fn= lstm_Trial.model(), epochs=3, batch_size=40, verbose=1)
Then I call the .fit() method:
model.fit(x = X_Train, y = Y_Train_Ori)
This raises an error:
ValueError                                Traceback (most recent call last)
 in ()
----> 1 model.fit(x = X_Train, y = Y_Train_Ori)

~/anaconda3/lib/python3.6/site-packages/keras/wrappers/scikit_learn.py in fit(self, x, y, sample_weight, **kwargs)
    203             y = np.searchsorted(self.classes_, y)
    204         else:
--> 205             raise ValueError('Invalid shape for y: ' + str(y.shape))
    206         self.n_classes_ = len(self.classes_)
    207         if sample_weight is not None:

ValueError: Invalid shape for y: (27, 4000, 1)
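The traceback suggests the wrapper rejects my 3-D target before any training starts. Below is a NumPy-only sketch of that check so the behaviour can be reproduced without Keras. The `else` branch is taken from the traceback; the two conditions before it are not visible there and are reconstructed from my reading of the Keras 2.x wrapper source, so treat them as an assumption. `check_y_shape` and `y_small` are hypothetical names, and `y_small` uses 40 time steps instead of 4000.

```python
import numpy as np

def check_y_shape(y):
    """Mirror of the branch in the traceback: only 1-D or 2-D y passes."""
    y = np.asarray(y)
    if y.ndim == 2 and y.shape[1] > 1:
        return np.arange(y.shape[1])   # treated as one column per class
    if y.ndim == 1 or (y.ndim == 2 and y.shape[1] == 1):
        return np.unique(y)            # treated as plain class labels
    raise ValueError('Invalid shape for y: ' + str(y.shape))

# Small stand-in for Y_Train_Ori: (27 x 40 x 1) instead of (27 x 4000 x 1)
y_small = np.zeros((27, 40, 1), dtype=int)
y_small[:, ::2, 0] = 1

try:
    check_y_shape(y_small)
except ValueError as err:
    print(err)  # Invalid shape for y: (27, 40, 1)

# Dropping the last axis passes the check, but each time step is then read as a class column
print(len(check_y_shape(y_small.reshape(27, 40))))  # 40
# Flattening passes too, but y then no longer lines up with the 27 sequences in X
print(check_y_shape(y_small.reshape(-1)))           # [0 1]
```

So reshaping y gets past the check but changes what the wrapper thinks the targets mean, which is why I am unsure how a per-time-step target is supposed to be passed.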
How do I use KerasClassifier properly so that I can ultimately use the ELI5 package?