
I want to build an RNN using a Keras Sequential model with a TensorFlow backend. When I run the following code:

from keras.models import Sequential
from keras.layers import LSTM, Dropout, TimeDistributed, Dense, Activation

batch_size = 8
batch_inputshape = (batch_size, x_train.shape[1], x_train.shape[2])
print(batch_inputshape)  # (8, 600, 103)

model = Sequential()
model.add(LSTM(103,
               batch_input_shape=batch_inputshape,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))

model.add(LSTM(50,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))

model.add(TimeDistributed(Dense(10)))
model.add(TimeDistributed(Dense(2)))
model.add(Activation('softmax'))
# ncce is the custom loss defined under EDIT 2
model.compile(loss=ncce, optimizer='adam')

print(model.output_shape)  # (8, 600, 2)

model.fit(x_train, y_train, batch_size=batch_size,
          nb_epoch=1, validation_split=0.25)

I get the following error message:

Input to reshape is a tensor with 16 values, but the requested shape has 8

Whatever value I set batch_size to, the error follows the same pattern:

Input to reshape is a tensor with 2 * batch_size values, but the requested shape has batch_size

I have looked at other Q&As, but I do not think they help me much, or I don't understand the answers well enough.

Any help would be much appreciated!

EDIT: as requested the shape of input and target:

print(x_train.shape) #(512,600,103)
print(y_train.shape) #(512,600,2)

EDIT 2:

from functools import partial
from itertools import product

import keras.backend as K
import numpy as np

def w_categorical_crossentropy(y_true, y_pred, weights):
    # https://github.com/fchollet/keras/issues/2115#issuecomment-274101310
    nb_cl = len(weights)
    final_mask = K.zeros_like(y_pred[:, 0])
    y_pred_max = K.max(y_pred, axis=1)
    y_pred_max = K.reshape(y_pred_max, (K.shape(y_pred)[0], 1))
    y_pred_max_mat = K.cast(K.equal(y_pred, y_pred_max), K.floatx())
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        final_mask += (weights[c_t, c_p] * y_pred_max_mat[:, c_p] * y_true[:, c_t])
    return K.categorical_crossentropy(y_pred, y_true) * final_mask

w_array = np.ones((2, 2))
w_array[1, 0] = 100

print(w_array)
ncce = partial(w_categorical_crossentropy, weights=w_array)
ncce.__name__ = 'w_categorical_crossentropy'

EDIT 3: UPDATE

With the help of @Nassim Ben, we figured out that the problem is in the loss function. He posted code with a regular loss function, and that works just fine; with the custom loss function, the same code fails. The custom loss function is posted above under EDIT 2, and that is where the problem lies. I do not yet know why this error occurs, but this is the current status.

NeoTT

1 Answer

EDIT : This code works for me, I have only changed the loss for simplicity.

import keras
from keras.layers import *
from keras.models import Sequential
from keras.objectives import *
import numpy as np

x_train = np.random.random((512,600, 103))
y_train = np.random.random((512,600,2))
batch_size = 8
batch_inputshape = (batch_size,x_train.shape[1],x_train.shape[2]) 
print(batch_inputshape) #(8, 600, 103)

model = Sequential()
model.add(LSTM(103,
               batch_input_shape=batch_inputshape,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))
model.add(LSTM(50,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))

model.add(TimeDistributed(Dense(10)))
model.add(TimeDistributed(Dense(2)))
model.add(Activation('softmax'))
model.compile(loss="mse", optimizer='adam')

print(model.output_shape)  # (8, 600, 2)

model.fit(x_train, y_train, batch_size=batch_size,
          nb_epoch=1, validation_split=0.25)

EDIT 2:

So the error comes from the loss function. The ncce loss you copied from GitHub was written for outputs of shape (batch, 10), but your outputs have shape (batch, 600, 2). Here is my edit of the function:

from itertools import product

import keras.backend as K
import numpy as np

def w_categorical_crossentropy(y_true, y_pred, weights):
    # https://github.com/fchollet/keras/issues/2115#issuecomment-274101310
    nb_cl = len(weights)
    # Create a mask of zeros with shape (batch, 600)
    final_mask = K.zeros_like(y_pred[:, :, 0])
    # Get the maximum probability for every output (shape = (batch, 600, 1))
    y_pred_max = K.max(y_pred, axis=2, keepdims=True)
    # Mark the predicted class for every output (shape = (batch, 600, 2)).
    # K.equal broadcasts y_pred_max (batch, 600, 1) against y_pred (batch, 600, 2).
    y_pred_max_mat = K.equal(y_pred, y_pred_max)
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        # Build the mask of weights to apply to the categorical crossentropy
        final_mask += (weights[c_t, c_p] * K.cast(y_pred_max_mat[:, :, c_p], K.floatx()) * y_true[:, :, c_t])
    return K.categorical_crossentropy(y_pred, y_true) * final_mask

w_array = np.ones((2, 2))
w_array[1, 0] = 100

As you can see, I only changed the indexing to match your shapes. The mask has to have shape (batch, 600). The max has to be taken over the third dimension, because that is where the class probabilities you want to output live. The element-wise multiplication that builds the mask had to be updated as well, again because of the shape of your tensors.
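A NumPy-only sketch of why the original loss most likely produced "16 values ... requested shape has 8". It assumes that NumPy's `max` and `reshape` mirror the backend ops for this purpose; the array below is random stand-in data, not the real predictions. Taking the max over axis 1 removes the time axis and leaves (batch, 2), which holds 2 * batch_size values and therefore cannot be reshaped to (batch, 1).

```python
import numpy as np

batch_size, timesteps, n_classes = 8, 600, 2
# Stand-in for the model output; the values themselves do not matter here.
y_pred = np.random.random((batch_size, timesteps, n_classes))

# Original loss: max over axis 1 collapses the time axis, not the class axis.
y_pred_max = y_pred.max(axis=1)
print(y_pred_max.shape, y_pred_max.size)  # (8, 2) 16

# Reshaping 16 values into (8, 1) is impossible, which matches the reported error.
reshape_error = None
try:
    y_pred_max.reshape((batch_size, 1))
except ValueError as err:
    reshape_error = str(err)
print(reshape_error)

# Edited loss: max over the class axis, kept as a size-1 axis for broadcasting.
y_pred_max_fixed = y_pred.max(axis=2, keepdims=True)
print(y_pred_max_fixed.shape)  # (8, 600, 1)
```

This would also explain why the mismatch scales with batch_size: the factor of two is the number of classes.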

This should work.
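To check what the mask does without running Keras, here is a rough NumPy translation of the mask-building part of the function. The helper name `weight_mask` and the tiny example arrays are made up for illustration; only the indexing logic is taken from the edit above.

```python
import numpy as np
from itertools import product

def weight_mask(y_true, y_pred, weights):
    # Hypothetical NumPy counterpart of the mask built inside the Keras loss.
    nb_cl = len(weights)
    mask = np.zeros_like(y_pred[:, :, 0])                  # (batch, timesteps)
    y_pred_max = y_pred.max(axis=2, keepdims=True)         # (batch, timesteps, 1)
    is_max = (y_pred == y_pred_max).astype(y_pred.dtype)   # one-hot of the predicted class
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        mask += weights[c_t, c_p] * is_max[:, :, c_p] * y_true[:, :, c_t]
    return mask

w_array = np.ones((2, 2))
w_array[1, 0] = 100  # true class 1 predicted as class 0 costs 100x

# One sample, three timesteps, two classes (made-up values).
y_true = np.array([[[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]])
y_pred = np.array([[[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]])

mask = weight_mask(y_true, y_pred, w_array)
print(mask.tolist())  # [[1.0, 100.0, 1.0]]
```

Only the second timestep, where the true class is 1 but class 0 is predicted, picks up the weight of 100; the correctly classified timesteps keep a weight of 1.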

If you need a more detailed explanation, feel free to ask :-)

Nassim Ben
  • x_train.shape = (512,600,103) and, y_train.shape = (512,600,2) I do not understand your suggested solution. Because in a sequential model your x should be a tensor with 3 dimensions, right? like: (samples,sequence,features) – NeoTT Feb 08 '17 at 14:46
  • well, your code is running on my machine, what is your config? Keras version, backend ? And you are right, I have answered too quickly. Your code is fine on my machine – Nassim Ben Feb 08 '17 at 15:00
  • Where do i find these version numbers? – NeoTT Feb 08 '17 at 15:05
  • print(keras.__version__) what is the backend you are using? do the same for the backend please – Nassim Ben Feb 08 '17 at 15:09
  • print(keras.__version__) print(tensorflow.__version__) 1.2.1 0.12.1 – NeoTT Feb 08 '17 at 15:17
  • Odd, I am using the same... can you copy paste my code and test it ? – Nassim Ben Feb 08 '17 at 15:18
  • So I now just use np.zeros() with these shapes (because there are many more than 512 samples and it takes forever to load), but it gives the same error. Certainly, I would like to try your code – NeoTT Feb 08 '17 at 15:22
  • Look at the post above, i edited my bs answer to put the code that is working for me. – Nassim Ben Feb 08 '17 at 15:23
  • This also works just fine for me... I will try it now with my loss function – NeoTT Feb 08 '17 at 15:27
  • The problem is in the loss function. The code you provided also does not work with my custom loss function. I will post the loss function in my original question. – NeoTT Feb 08 '17 at 15:30
  • As you can see I got this one from github. The link is in the code. Stupid of me not to check that part. Do you have any idea how to make it work? – NeoTT Feb 08 '17 at 15:46
  • Thanks for your help. This would have taken me weeks. I really have to go now. I will edit this whole question again later to make sense for people rereading this. If you have any suggestion to fix the loss function much appreciated but this is a whole new question of course. And I will first go try to find the answer myself before I post something. – NeoTT Feb 08 '17 at 16:08
  • I'm looking for an answer :) Not sure how to modify that loss function but I'll try to – Nassim Ben Feb 08 '17 at 16:08