
I've consulted the answer here: Why does my training loss have regular spikes? about spikes in the training loss. However, I am not using mini-batches: I set `batch_size` larger than the dataset, so each epoch is a single full-batch update.

My model is as follows; I'm using a batch size of 1_000_000 with a data size of 100_000:

    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers, regularizers
    from tensorflow.keras.layers.experimental import preprocessing

    # Normalize inputs using statistics learned from the data
    norm = preprocessing.Normalization()
    norm.adapt(data)

    model = keras.Sequential([
      norm,
      layers.Dense(100, activation='tanh', kernel_regularizer=regularizers.l2(1e-5)),
      layers.Dense(100, activation='tanh', kernel_regularizer=regularizers.l2(1e-5)),
      layers.Dense(100, activation='tanh', kernel_regularizer=regularizers.l2(1e-5)),
      layers.Dense(1)
    ])

    # Inverse-time decay: lr = 0.01 / (1 + step / 2000)
    lr_schedule = tf.keras.optimizers.schedules.InverseTimeDecay(
      0.01,
      decay_steps=2000,
      decay_rate=1,
      staircase=False)

    optimizer = tf.keras.optimizers.Adam(lr_schedule)
    model.compile(loss='huber',
                  optimizer=optimizer, metrics=['mean_absolute_error'])

    # batch_size exceeds the data size, so each epoch is one full-batch update
    history = model.fit(
        train[0], train[1], validation_data=test, batch_size=1_000_000,
        verbose=2, epochs=epochs)

[Plot: training loss vs. epoch, showing regular spikes]

Looking at the regression fits, I can see that the spikes are not a plotting artifact: they correspond to visibly bad fits.
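
For reference, here is a minimal sketch of how such a loss curve can be plotted from the `history` object returned by `model.fit` above (the plotting code itself was not in the original post):

    import matplotlib.pyplot as plt

    # history.history holds per-epoch metrics; with batch_size >= data size,
    # one epoch equals one gradient step, so epochs and steps coincide.
    plt.plot(history.history['loss'], label='training loss')
    plt.plot(history.history['val_loss'], label='validation loss')
    plt.xlabel('epoch (= step, since there is one batch per epoch)')
    plt.ylabel('Huber loss')
    plt.yscale('log')   # spikes stand out more clearly on a log scale
    plt.legend()
    plt.show()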

  • You didn't include the training step; what makes you think batches are not being used? It is kind of the only way to train a model in Keras. – Dr. Snoopy Apr 23 '21 at 14:39
  • Added the `.fit`. If I specify a batch size larger than the data size, then that must mean there is only one "batch" per epoch (i.e. the entire dataset is used for each update). Right? Indeed, the progress output shows that there is only one "batch" per epoch (see the sketch after these comments). – Lucidnonsense Apr 23 '21 at 14:58
  • The X axis seems to show steps, so it is likely that some steps (a single batch) have higher loss. – Dr. Snoopy Apr 23 '21 at 15:06
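
A quick sketch confirming the one-batch-per-epoch arithmetic from the comment above (the numbers are the ones quoted in the question):

    import math

    n_samples = 100_000      # data size from the question
    batch_size = 1_000_000   # batch_size passed to model.fit

    # Keras performs ceil(n_samples / batch_size) updates per epoch;
    # a batch size larger than the dataset gives exactly one full-batch step.
    print(math.ceil(n_samples / batch_size))  # -> 1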

0 Answers