
I'm attempting to wrap my Keras neural network in a class. I have implemented the code below outside of a class setting, but I want to make it more object-friendly. To summarize, my model calls the method `sequential_model`, which builds a `Sequential` model. In the compile step I pass my own loss function, `weighted_categorical_crossentropy`, for the model to use. However, when I run the code below I get the following error: `ValueError: No gradients provided for any variable`

I suspect the issue is with how I'm defining the `weighted_categorical_crossentropy` function for use by the `Sequential` model.

Again, I was able to get this to work in a non-object-oriented way. Any help would be much appreciated.

import numpy as np
from tensorflow.keras import Sequential, backend as K
from tensorflow.keras.layers import LSTM, Dense

class MyNetwork(): 
        
    def __init__(self, file, n_output=4, n_hidden=20, epochs=3,
                 dropout=0.10, batch_size=64, metrics = ['categorical_accuracy'],
                 optimizer = 'rmsprop', activation = 'softmax'):

        [...]  # Other class attributes
 
    def model(self):
        self.model = self.sequential_model(False)
        self.model.summary()


    def sequential_model(self, val):
        K.clear_session()
        if val == False:
            self.epochs = 3
        regressor = Sequential()
        #regressor.run_eagerly = True
        regressor.add(LSTM(units = self.n_hidden, dropout=self.dropout, return_sequences = True, input_shape = (self.X.shape[1], self.X.shape[2])))
        regressor.add(LSTM(units = self.n_hidden, dropout=self.dropout, return_sequences = True))
        regressor.add(Dense(units = self.n_output, activation=self.activation))
    
        self.weights = np.array([0.025,0.225,0.78,0.020])

        regressor.compile(optimizer = self.optimizer, loss = self.weighted_categorical_crossentropy(self.weights), metrics = [self.metrics])
        regressor.fit(self.X, self.Y*1.0,batch_size=self.batch_size, epochs=self.epochs, verbose=1, validation_data=(self.Xval, self.Yval*1.0))

        return regressor

    def weighted_categorical_crossentropy(self, weights):
        weights = K.variable(weights)
        def loss(y_true, y_pred):
            y_pred /= K.sum(y_pred, axis=-1, keepdims=True)
            y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
            loss = y_true * K.log(y_pred) * weights
            loss = -K.sum(loss, -1)
            return loss
Josh

1 Answer

There are several problems with the above code, but the most noticeable one is that you don't return the inner `loss` function from `weighted_categorical_crossentropy`. It should look more like:

    def weighted_categorical_crossentropy(self, weights):
        weights = K.variable(weights)
        def loss(y_true, y_pred):
            y_pred /= K.sum(y_pred, axis=-1, keepdims=True)
            y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
            loss = y_true * K.log(y_pred) * weights
            loss = -K.sum(loss, -1)
            return loss
        return loss # Return the callable function!

The error is `ValueError: No gradients provided for any variable` because the `weighted_categorical_crossentropy` method doesn't return anything; it implicitly returns `None`. If you try to fit a model with `loss=None`, it has no way of computing gradients and will therefore throw this exact error.
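The difference is easy to see without Keras at all: a loss-function factory that forgets its outer `return` hands back `None` instead of a callable. A plain-Python sketch (the factory names and the squared-error stand-in are illustrative, not from the question):

```python
def broken_factory(w):
    def loss(y_true, y_pred):
        return w * (y_true - y_pred) ** 2
    # no return here -> broken_factory(...) evaluates to None

def fixed_factory(w):
    def loss(y_true, y_pred):
        return w * (y_true - y_pred) ** 2
    return loss  # hand the callable back to the caller

assert broken_factory(2.0) is None           # what Keras received as `loss`
assert fixed_factory(2.0)(1.0, 0.5) == 0.5   # a usable loss callable
```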

Next up: you are using `return_sequences = True` in the layer right before a non-recurrent layer. This feeds the `Dense` layer 3-D data that is appropriate only for recurrent layers. Don't use it like that.
If you have a good reason for keeping `return_sequences = True`, then you must wrap the `Dense` layer like:

model.add(keras.layers.TimeDistributed(keras.layers.Dense(...)))

This makes the `Dense` layer act on the output sequence at every time step separately. It also means that your `y_true` must have the matching shape.
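Conceptually, `TimeDistributed` just applies one shared `Dense` kernel at every time step. A numpy-only sketch of that idea (the batch, time, and class sizes are assumptions, chosen to match the shapes discussed later in the comments):

```python
import numpy as np

# Hypothetical shapes: batch=2, timesteps=20, hidden=20, classes=4
x = np.random.rand(2, 20, 20)   # LSTM output with return_sequences=True
W = np.random.rand(20, 4)       # one shared Dense kernel
b = np.zeros(4)

# TimeDistributed(Dense) == applying the same W, b at every time step
out = x @ W + b                 # shape (2, 20, 4)
assert out.shape == (2, 20, 4)  # so y_true must also be (batch, time, classes)
```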

There could be other problems with the custom loss function you defined, but I cannot deduce the input/output shapes, so you will have to run it and see whether it works. A matrix-multiplication shape mismatch is the most likely failure.
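One cheap way to catch such a mismatch early is to reproduce the loss math in plain numpy on dummy arrays before wiring it into `compile` (the shapes and dummy values here are illustrative assumptions):

```python
import numpy as np

weights = np.array([0.025, 0.225, 0.78, 0.020])   # per-class weights, shape (4,)
y_true = np.eye(4)[[0, 2]]                        # two one-hot labels, shape (2, 4)
y_pred = np.array([[0.7, 0.1, 0.1, 0.1],
                   [0.1, 0.1, 0.7, 0.1]])         # dummy softmax outputs

eps = 1e-7                                        # stand-in for K.epsilon()
y_pred = y_pred / y_pred.sum(axis=-1, keepdims=True)
y_pred = np.clip(y_pred, eps, 1 - eps)
# weights broadcasts across the last (class) axis, so its length
# must equal the number of classes in y_pred
loss = -np.sum(y_true * np.log(y_pred) * weights, axis=-1)
assert loss.shape == (2,)                         # one scalar loss per sample
```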

Last but not least, think about using the sub-classing API. Could it make any of your operations easier to write?
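For reference, a minimal sketch of what the sub-classed model might look like (layer sizes mirror the question; the class layout is an illustrative outline, not a drop-in replacement):

```python
import tensorflow as tf

class MyNetwork(tf.keras.Model):
    """Sketch of the model from the question, using the sub-classing API."""

    def __init__(self, n_hidden=20, n_output=4, dropout=0.10):
        super().__init__()
        self.lstm1 = tf.keras.layers.LSTM(n_hidden, dropout=dropout,
                                          return_sequences=True)
        self.lstm2 = tf.keras.layers.LSTM(n_hidden, dropout=dropout,
                                          return_sequences=True)
        # TimeDistributed applies the same Dense weights at every time step
        self.head = tf.keras.layers.TimeDistributed(
            tf.keras.layers.Dense(n_output, activation='softmax'))

    def call(self, inputs):
        x = self.lstm1(inputs)
        x = self.lstm2(x)
        return self.head(x)
```

The custom loss factory then plugs in unchanged, e.g. `model.compile(optimizer='rmsprop', loss=weighted_categorical_crossentropy(weights))`, as long as the factory returns the inner callable.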

Thanks for reading and I'll update this answer once I have that info. Cheers.

tornikeo
  • Thank you for pointing out that obvious logic error, and I appreciate your thorough response! Re your second point, my `y_true` is an `n X time X classes` matrix (in this case: `n X 20 X 4`), which is why `n_hidden` is of size 20 and `n_output` has 4 distinct classes. The hidden size of 20 means 20 sequential periods of modeling (in order, per every n), so from your point above it looks like I should have defined the `Dense` layer as you mentioned. Re your third point, I will have to look into the sub-classing API. Thanks again, I would treat you to some good khinkali if possible. – Josh Sep 07 '20 at 13:41
  • My pleasure @Josh. Glad to help. Also, about that khinkali: if you somehow decide to visit Georgia one day, I'd be more than happy to invite you to a local sakhinkle (the name of a place where it is served). Cheers. :) – tornikeo Sep 07 '20 at 14:59
  • It's only a matter of time, my wife is from Poti and I've been instructed to visit Bakhmaro as soon as possible :D – Josh Sep 07 '20 at 15:28
  • Hey @tornikeo, hope you're well. You're the closest thing to an expert I know in this field, and I would truly appreciate it if you could take a look at another one of my ANN questions: https://stackoverflow.com/questions/65067446/loss-function-not-improving-with-epochs-custom-loss-function – Josh Nov 30 '20 at 02:39
  • Hey @Josh, I'm fine. How are you? I have some errands to run today, and as soon as I get some free time on my hands I'll try to answer your question. I already bookmarked it. – tornikeo Nov 30 '20 at 11:40
  • Is there any way I can get your email (or a dummy email)? I'm preparing a Google Colab and I don't want to take the chance of providing public access to my drive (I'm trying to keep it as private as I can). If not, I can try to find a workaround. Thanks. – Josh Dec 01 '20 at 17:42
  • Sure. Email me at tonop15@freeuni.edu.ge - that's an old university email address of mine. Cheers. – tornikeo Dec 02 '20 at 12:52
  • Thanks! Please look for an email with the subject "Stackoverflow: Josh Query". I just sent it with the Colab link, etc. – Josh Dec 02 '20 at 13:45