5
bce = tf.keras.losses.BinaryCrossentropy()
ll=bce(y_test[0], model.predict(X_test[0].reshape(1,-1)))
print(ll)
<tf.Tensor: shape=(), dtype=float32, numpy=0.04165391>
print(model.input)
<tf.Tensor 'dense_1_input:0' shape=(None, 195) dtype=float32>
model.output
<tf.Tensor 'dense_3/Sigmoid:0' shape=(None, 1) dtype=float32>
grads=K.gradients(ll, model.input)[0]
print(grads)
None

So here I have trained a neural network with 2 hidden layers; the input has 195 features and the output is a single value. I want to feed the network the validation instances in X_test one by one, together with their correct labels in y_test, and for each instance calculate the gradients of the output with respect to the input. However, grads prints as None. Your help is appreciated.

  • Does this answer your question? [How to obtain the gradients in keras?](https://stackoverflow.com/questions/51140950/how-to-obtain-the-gradients-in-keras) – Yoskutik May 23 '20 at 08:41

1 Answer

5

One can do this using tf.GradientTape. I wrote the following code to learn a sine wave and get its derivative, in the spirit of this question. I think it should be possible to extend the following code to compute partial derivatives.
Importing the needed libraries:

import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import losses
import tensorflow as tf

Create the data:

x = np.linspace(0, 6*np.pi, 2000).reshape(-1, 1)  # column vector of shape (2000, 1), matching Input(shape=(1,))
y = np.sin(x)

Defining a Keras NN:

def model_gen(Input_shape):
    X_input = Input(shape=Input_shape)
    X = Dense(units=64, activation='sigmoid')(X_input)
    X = Dense(units=64, activation='sigmoid')(X)
    X = Dense(units=1)(X)
    model = Model(inputs=X_input, outputs=X)
    return model

Training the model:

model = model_gen(Input_shape=(1,))
opt = Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.999, decay=0.001)
model.compile(loss=losses.mean_squared_error, optimizer=opt)
model.fit(x,y, epochs=200)
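
As a quick sanity check before differentiating (a sketch that assumes matplotlib is available), one can compare the network's predictions with the target sine wave:

import matplotlib.pyplot as plt

plt.plot(x, y, label='sin(x)')
plt.plot(x, model.predict(x), '--', label='NN prediction')
plt.legend()
plt.show()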

To obtain the gradient of the network w.r.t. the input:

x = tf.constant(x, dtype=tf.float32)  # shape (2000, 1); the model expects a 2-D batch, not a flat vector
with tf.GradientTape() as t:
  t.watch(x)   # x is a constant, not a Variable, so it must be watched explicitly
  y = model(x)

dy_dx = t.gradient(y, x)  # derivative of the output w.r.t. the input, shape (2000, 1)

dy_dx.numpy()

One can further visualise dy_dx to check how smooth the derivative is. Finally, note that one gets a smoother derivative when using a smooth activation (e.g. sigmoid) instead of ReLU, as noted here.
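
For this example the derivative should approximate cos(x), so a small plotting sketch (again assuming matplotlib) can serve as a check:

import matplotlib.pyplot as plt

plt.plot(x.numpy(), np.cos(x.numpy()), label='cos(x)')
plt.plot(x.numpy(), dy_dx.numpy(), '--', label='dy/dx from GradientTape')
plt.legend()
plt.show()

Applied to the setup in the question (195 input features, a single sigmoid output and a binary cross-entropy loss), the same idea would look roughly like the sketch below; model, X_test and y_test are the question's own objects, and the exact shapes are assumptions:

bce = tf.keras.losses.BinaryCrossentropy()

# one validation instance, reshaped to (1, 195) to match the model's input
x_instance = tf.constant(X_test[0].reshape(1, -1), dtype=tf.float32)
y_true = tf.constant([[y_test[0]]], dtype=tf.float32)

with tf.GradientTape() as tape:
    tape.watch(x_instance)
    y_pred = model(x_instance)   # call the model directly, not model.predict
    loss = bce(y_true, y_pred)

# d(loss)/d(input): one value per input feature, shape (1, 195)
grads = tape.gradient(loss, x_instance)

The important point is that the prediction and the loss are computed inside the tape with model(...); model.predict returns a plain NumPy array, so the loss in the question is disconnected from model.input, which is why K.gradients returned None.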

Saleh
  • Hi, I tried exactly what was written using Tf1 and 2. However, I got errors. For Tf1, my error is: ValueError: Input 0 of layer dense_18 is incompatible with the layer: : expected min_ndim=2, found ndim=1. Full shape received: [2000] . In TF2, my error is: InvalidArgumentError: In[0] is not a matrix. Instead it has shape [2000] [Op:MatMul]. Btw, the error occured at y = model(x). Can someone explain why and what's the solution? Thanks. – quarkz Sep 22 '21 at 14:35