Loss starts to jump around after a few epochs

Question

I am trying to train a neural network to detect steganographic images, using NVIDIA DIGITS with TensorFlow. My problem is that the loss decreases gradually at first and then starts to jump around.

My neural network is:

from model import Tower
from utils import model_property
import tensorflow as tf
import tensorflow.contrib.slim as slim
import utils as digits

class UserModel(Tower):

    @model_property
    def inference(self):
        x = tf.reshape(self.x, shape=[-1, self.input_shape[0], self.input_shape[1], self.input_shape[2]])
        with slim.arg_scope([slim.conv2d, slim.fully_connected],
                            weights_initializer=tf.contrib.layers.xavier_initializer(),
                            weights_regularizer=slim.l2_regularizer(0.00001)):
            conv1 = tf.layers.conv2d(inputs=x, filters=64, kernel_size=7, padding='Valid', strides=2, activation=tf.nn.relu)
            rnorm1 = tf.nn.local_response_normalization(input=conv1)
            conv2 = tf.layers.conv2d(inputs=rnorm1, filters=16, kernel_size=5, padding='Valid', strides=1, activation=tf.nn.relu)
            rnorm2 = tf.nn.local_response_normalization(input=conv2) 
            flatten = tf.contrib.layers.flatten(rnorm2)
            fc1 = tf.contrib.layers.fully_connected(inputs=flatten, num_outputs=1000, activation_fn=tf.nn.relu)
            fc2 = tf.contrib.layers.fully_connected(inputs=fc1, num_outputs=1000, activation_fn=tf.nn.relu)
            fc3 = tf.contrib.layers.fully_connected(inputs=fc2, num_outputs=2, activation_fn=None)
            return fc3

    @model_property
    def loss(self):
        model = self.inference
        loss = digits.classification_loss(model, self.y)
        accuracy = digits.classification_accuracy(model, self.y)
        self.summaries.append(tf.summary.scalar(accuracy.op.name, accuracy))
        return loss

I am using SGD with a base learning rate of 0.0005. I changed the step size to 5% with a gamma of 0.95. (I chose these settings after reading that the loss starts to jump around after a while when the learning rate isn't decreasing fast enough. Earlier I used the same 0.0005 base rate with the NVIDIA DIGITS default step size.)
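For reference, the schedule described above can be sketched in a few lines. This is only an illustration, assuming a plain "step down" policy in which the learning rate is multiplied by gamma once every 5% of the training run; the function name `step_decay_lr` and its parameters are hypothetical and are not part of DIGITS or TensorFlow.

```python
def step_decay_lr(base_lr, gamma, step_percent, progress_percent):
    """Learning rate after `progress_percent` percent of training.

    Assumes a plain "step down" policy: the rate is multiplied by `gamma`
    once every `step_percent` percent of the run (hypothetical helper).
    """
    completed_steps = progress_percent // step_percent
    return base_lr * gamma ** completed_steps


# Settings from the question: base rate 0.0005, step size 5%, gamma 0.95.
for progress in (0, 25, 50, 75, 99):
    lr = step_decay_lr(0.0005, 0.95, 5, progress)
    print(f"{progress:3d}% of training -> learning rate {lr:.6f}")
```

Under that assumption the rate is reduced 19 times over the run, so it never falls below roughly a third of its starting value.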

Do you know how to make the loss decrease gradually? Any advice or guidance on improving the network would be appreciated.

Thanks!

score 1 · Answer 1 · answered May 12 '18 at 12:47

So, if anyone is having the same issue: what I did was set the initial learning rate to 0.0001, the step size to 5%, and gamma to 0.9. That gave me a mostly gradual reduction in loss.

However, I suspect the learning rate is now too low, as the loss doesn't go down as much as I would like.
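A rough way to check that suspicion is to compare where the learning rate ends up under the two configurations mentioned in this thread. The sketch below assumes the rate is multiplied by gamma every 5% of training, so 19 decays have been applied by the final stretch of the run; the dictionary labels and variable names are made up for illustration.

```python
# Hypothetical comparison of the two step-decay configurations in this thread.
# Assumption: the rate is multiplied by gamma every 5% of training, so 19
# decays have been applied by the final 5% of the run.
configs = {
    "question (base 0.0005, gamma 0.95)": (0.0005, 0.95),
    "answer   (base 0.0001, gamma 0.90)": (0.0001, 0.90),
}
decays_applied = 19

final_rates = {}
for name, (base_lr, gamma) in configs.items():
    final_rates[name] = base_lr * gamma ** decays_applied
    print(f"{name}: final learning rate {final_rates[name]:.2e}")
```

If the assumption holds, the adjusted settings finish with a learning rate about 14 times smaller than the original ones, which would be consistent with the loss flattening out early.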
