Keras: LSTM with class weights

Question

My question is closely related to this question, but goes beyond it.

I am trying to implement the following LSTM in Keras where

the number of timesteps is nb_tsteps=10
the number of input features is nb_feat=40
the number of LSTM cells at each time step is 120
the LSTM layer is followed by TimeDistributedDense layers

From the question referenced above, I understand that I have to present the input data with shape

(nb_samples, 10, 40)

where I get nb_samples by rolling a window of length nb_tsteps=10 across the original time series of shape (5932720, 40). The model is defined as follows:

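from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, TimeDistributed

# X_train has shape (nb_samples, nb_tsteps, nb_feat) = (nb_samples, 10, 40)
model = Sequential()
# return_sequences=True keeps one output per timestep (3D output)
model.add(LSTM(120, input_shape=(X_train.shape[1], X_train.shape[2]),
               return_sequences=True, consume_less='gpu'))
model.add(TimeDistributed(Dense(50, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(20, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(10, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(3, activation='relu')))
# One sigmoid unit per timestep for the binary response
model.add(TimeDistributed(Dense(1, activation='sigmoid')))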
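
For reference, the rolling-window step mentioned before the model code could be sketched roughly as follows. This is only an illustration, not the asker's actual preprocessing: the helper name make_windows and the toy array are assumptions, and it relies on numpy.lib.stride_tricks.sliding_window_view (NumPy 1.20 or later), which returns a view rather than copying the series.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

nb_tsteps = 10


def make_windows(series, nb_tsteps):
    """Turn a (T, nb_feat) array into overlapping windows of shape
    (T - nb_tsteps + 1, nb_tsteps, nb_feat). (Hypothetical helper.)"""
    # sliding_window_view appends the window axis last:
    # (T - nb_tsteps + 1, nb_feat, nb_tsteps)
    windows = sliding_window_view(series, nb_tsteps, axis=0)
    # Reorder to (samples, timesteps, features), as the LSTM layer expects
    return windows.transpose(0, 2, 1)


# Toy stand-in for the real (5932720, 40) series
series = np.arange(25 * 4, dtype="float32").reshape(25, 4)
X = make_windows(series, nb_tsteps)
print(X.shape)  # (16, 10, 4)
```

Each window X[i] covers rows i to i + nb_tsteps - 1 of the original series, so a series of length T yields T - nb_tsteps + 1 samples.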

Now to my question (assuming the above is correct so far): the binary responses (0/1) are heavily imbalanced, so I need to pass a class_weight dictionary such as cw = {0: 1, 1: 25} to model.fit(). However, this raises the exception "class_weight not supported for 3+ dimensional targets", because I present the response data with shape (nb_samples, 1, 1). If I reshape it into a 2D array of shape (nb_samples, 1), I instead get the exception "Error when checking model target: expected timedistributed_5 to have 3 dimensions, but got array with shape (5932720, 1)".

Thanks a lot for any help!

score 7 · Accepted Answer · answered Aug 12 '16 at 08:53

I think you should use sample_weight with sample_weight_mode='temporal'.

From the Keras docs:

sample_weight: Numpy array of weights for the training samples, used for scaling the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. In this case you should make sure to specify sample_weight_mode="temporal" in compile().

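In your case you would need to supply a 2D array of shape (nb_samples, nb_tsteps), i.e. the shape of your labels without the trailing singleton dimension. To get the effect of class weights, set each entry to the weight of the class that the corresponding label belongs to.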
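
A minimal sketch of how such a weight array might be derived from the class weights in the question, assuming 0/1 labels of shape (nb_samples, nb_tsteps, 1). The helper name temporal_sample_weights and the toy labels are illustrative only, and the commented-out Keras calls assume a Keras version whose compile() accepts sample_weight_mode, as described in the docs excerpt above.

```python
import numpy as np

# Class weights from the question: the rare class 1 is up-weighted
class_weight = {0: 1.0, 1: 25.0}


def temporal_sample_weights(y, class_weight):
    """Map 0/1 labels of shape (samples, timesteps, 1) to a 2D array of
    per-timestep weights with shape (samples, timesteps). (Hypothetical helper.)"""
    y = np.asarray(y)
    y2d = y.reshape(y.shape[0], y.shape[1])  # drop the trailing singleton axis
    weights = np.where(y2d == 1, class_weight[1], class_weight[0])
    return weights.astype("float32")


# Toy labels standing in for the real targets: 2 samples, 3 timesteps
y_train = np.array([[[0], [1], [0]],
                    [[1], [1], [0]]])
sample_weight = temporal_sample_weights(y_train, class_weight)
print(sample_weight.shape)  # (2, 3)

# Assumed usage with the model from the question (not executed here):
# model.compile(loss='binary_crossentropy', optimizer='adam',
#               sample_weight_mode='temporal')
# model.fit(X_train, y_train, sample_weight=sample_weight)
```

Because every timestep's weight is looked up from its own label, this reproduces the effect of class_weight while keeping the 3D targets that the TimeDistributed output requires.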

I do not understand why sample_weight (with sample_weight_mode="temporal") is relevant. We are talking about class weights here, not sample weights, right? sample_weight_mode="temporal" just assigns a weight to every sample in every time step, but it has nothing to do with class labels at all? — ymeng, Nov 09 '17 at 22:28

score 1 · Answer 2 · answered Apr 02 '19 at 17:43

In case this is still an issue: I think the TimeDistributed layer expects and returns a 3D array (similar to setting return_sequences=True in a regular LSTM layer). Try adding a Flatten() layer or another LSTM layer before the final prediction layer, so that the output becomes 2D.

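from keras.layers import Bidirectional, Dense, LSTM, TimeDistributed
# input_from_previous_layer: 3D tensor (samples, timesteps, features)
d = TimeDistributed(Dense(10))(input_from_previous_layer)
lstm_out = Bidirectional(LSTM(10))(d)  # no return_sequences, so output is 2D
output = Dense(1, activation='sigmoid')(lstm_out)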

score 0 · Answer 3 · answered Apr 30 '20 at 17:25

Using sample_weight_mode='temporal' is a workaround. Check out this Stack Overflow thread. The issue is also documented on GitHub.

chiceman
