
I am implementing a fully-connected model for classification on the MNIST dataset. Part of the code is the following:

model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(
    loss='categorical_crossentropy',
    optimizer=tf.optimizers.SGD(),
    metrics=["accuracy"]
)

model.fit(
    x_train,
    y_train,
    batch_size=64,
    epochs=3,
    validation_data=(x_test, y_test)
)

Is there a way to print the max gradient for each layer for a given mini-batch?

Mark Rotteveel
Marios Gab

1 Answer


One way is to replace compile() and fit() with a custom training loop: tf.GradientTape gives you direct access to the gradients of each mini-batch.

import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(0.001)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

# Iterate over mini-batches, not individual samples
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(64)

for x, y in dataset:
    with tf.GradientTape() as tape:
        predictions = model(x, training=True)
        loss_value = loss_fn(y, predictions)
    gradients = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    # Each Dense layer contributes two gradient tensors:
    # the kernel at index 2*i and the bias at index 2*i + 1
    for layer in range(4):  # 4 Dense layers
        print('max gradient of layer={}, kernel={}, bias={}'.format(
            layer,
            gradients[layer * 2].numpy().max(),
            gradients[layer * 2 + 1].numpy().max()))

Note that SparseCategoricalCrossentropy expects integer labels; if your y_train is one-hot encoded (as your original categorical_crossentropy loss implies), use CategoricalCrossentropy instead.

Check this out: About Keras

alionkun