I've consulted the answer here: Why does my training loss have regular spikes? about spikes in the training loss. However, I am not using mini-batches: my batch_size is larger than the data set, so each epoch is a single full-batch gradient step.
My model is as follows, with a batch size of 1_000_000 and a data size of 100_000:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, regularizers
from tensorflow.keras.layers.experimental import preprocessing

# Normalize inputs using statistics learned from the data
norm = preprocessing.Normalization()
norm.adapt(data)

model = keras.Sequential([
    norm,
    layers.Dense(100, activation='tanh', kernel_regularizer=regularizers.l2(1e-5)),
    layers.Dense(100, activation='tanh', kernel_regularizer=regularizers.l2(1e-5)),
    layers.Dense(100, activation='tanh', kernel_regularizer=regularizers.l2(1e-5)),
    layers.Dense(1)
])

# Learning rate decays as 0.01 / (1 + step / 2000)
lr_schedule = tf.keras.optimizers.schedules.InverseTimeDecay(
    0.01,
    decay_steps=2000,
    decay_rate=1,
    staircase=False)
optimizer = tf.keras.optimizers.Adam(lr_schedule)

model.compile(loss='huber',
              optimizer=optimizer, metrics=['mean_absolute_error'])

# batch_size exceeds the data size, so each epoch is one full-batch update
history = model.fit(
    train[0], train[1], validation_data=test, batch_size=1_000_000,
    verbose=2, epochs=epochs)
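For reference, this is roughly how I plot the recorded losses to see the spikes (a minimal sketch using matplotlib; history is the return value of fit above):

import matplotlib.pyplot as plt

# Per-epoch losses recorded by model.fit
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.yscale('log')  # spikes stand out on a log scale
plt.xlabel('epoch')
plt.ylabel('Huber loss')
plt.legend()
plt.show()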
Looking at the regression fits, I can see that the spikes are real in the sense that they correspond to visibly bad fits.
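To pair each spike with its fit, one option is a small callback that snapshots the model's predictions after every epoch (a sketch, not my exact code; it assumes test[0] holds the test inputs):

import tensorflow as tf

# Record predictions on a fixed input set after each epoch,
# so epochs with loss spikes can be matched to their fits.
class SnapshotPredictions(tf.keras.callbacks.Callback):
    def __init__(self, x_eval):
        super().__init__()
        self.x_eval = x_eval
        self.snapshots = []

    def on_epoch_end(self, epoch, logs=None):
        self.snapshots.append(self.model.predict(self.x_eval, verbose=0))

snap = SnapshotPredictions(test[0])
# then add callbacks=[snap] to the model.fit call above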