
I am using a customised batch generator in an attempt to fix the incompatible-shapes problem (BroadcastGradientArgs error) that I get with the standard model.fit() function, caused by the small size of the last batch in the training data. I used the batch generator mentioned here with the model.fit_generator() function:

import math
import numpy as np
from keras.utils import Sequence


class Generator(Sequence):
    # Class is a dataset wrapper for better training performance
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        return math.floor(self.x.shape[0] / self.batch_size) 

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size] #Line A
        batch_x = self.x[inds]
        batch_y = self.y[inds]
        return batch_x, batch_y

    def on_epoch_end(self):
        np.random.shuffle(self.indices)

But it seems that it discards the last batch if its size is smaller than the provided batch size. How can I update it to include the last batch and expand it (for example) with some repeated samples?

Also, somehow I don't get how "Line A" works!

Update: here is how I am using the generator with my model:

# dummy model
input_1 = Input(shape=(None,))
...
dense_1 = Dense(10, activation='relu')(input_1)
output_1 = Dense(1, activation='sigmoid')(dense_1)

model = Model(input_1, output_1)
print(model.summary())

#Compile and fit_generator
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

train_data_gen = Generator(x1_train, y_train, batch_size)
test_data_gen = Generator(x1_test, y_test, batch_size)

model.fit_generator(generator=train_data_gen, validation_data = test_data_gen, epochs=epochs, shuffle=False, verbose=1)

loss, accuracy = model.evaluate_generator(generator=test_data_gen)
print('Test Loss: %0.5f Accuracy: %0.5f' % (loss, accuracy))
Daisy

2 Answers


I think the culprit is this line:

    return math.floor(self.x.shape[0] / self.batch_size)

Replacing it with this might work:

    return math.ceil(self.x.shape[0] / self.batch_size) 

Imagine you have 100 samples and a batch size of 32. That divides into 3.125 batches, but with math.floor it becomes 3, so the remaining partial batch (the last 4 samples) is discarded.
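
A quick check of the arithmetic with nothing but the standard library:

import math

samples, batch_size = 100, 32
print(samples / batch_size)              # 3.125
print(math.floor(samples / batch_size))  # 3 -> the 4 leftover samples are never yielded
print(math.ceil(samples / batch_size))   # 4 -> a final, smaller batch of 4 samples is kept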

As for Line A: if the batch size is 32, then when idx is 1 the slice [idx * self.batch_size:(idx + 1) * self.batch_size] becomes [32:64]; in other words, it picks the 33rd to 64th elements of self.indices.
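
To make that concrete (same names as in the generator above, plain NumPy, nothing else assumed):

import numpy as np

indices = np.arange(100)
batch_size = 32

idx = 1
inds = indices[idx * batch_size:(idx + 1) * batch_size]  # same as indices[32:64]
print(inds[0], inds[-1])  # 32 63 -> the 33rd to 64th elements

# Slicing past the end does not raise; NumPy just returns what is left,
# so the final batch is simply smaller (here 100 - 3*32 = 4 elements).
idx = 3
print(len(indices[idx * batch_size:(idx + 1) * batch_size]))  # 4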

Update 2: I changed the input to have a (None, 10) shape, used an LSTM, and added evaluation:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = ""
import math
import numpy as np
from keras.models import Model
from keras.utils import Sequence
from keras.layers import Input, Dense, LSTM


class Generator(Sequence):
    # Class is a dataset wrapper for better training performance
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        return math.ceil(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]  # Line A
        batch_x = self.x[inds]
        batch_y = self.y[inds]
        return batch_x, batch_y

    def on_epoch_end(self):
        np.random.shuffle(self.indices)


# dummy model
input_1 = Input(shape=(None, 10))
x = LSTM(90)(input_1)
x = Dense(10)(x)
x = Dense(1, activation='sigmoid')(x)

model = Model(input_1, x)
print(model.summary())

# Compile and fit_generator
model.compile(optimizer='adam', loss='binary_crossentropy')

x1_train = np.random.rand(1590, 20, 10)
x1_test = np.random.rand(90, 20, 10)
y_train = np.random.rand(1590, 1)
y_test = np.random.rand(90, 1)

train_data_gen = Generator(x1_train, y_train, 256)
test_data_gen = Generator(x1_test, y_test, 256)

model.fit_generator(generator=train_data_gen,
                    validation_data=test_data_gen,
                    epochs=5,
                    shuffle=False,
                    verbose=1)

loss = model.evaluate_generator(generator=test_data_gen)
print('Test Loss: %0.5f' % loss)

This runs without any problem.

Natthaphon Hongcharoen
  • But it will give the same incompatible-shapes error with prime numbers. – Daisy Apr 28 '19 at 12:13
  • For example? There should not be any problem even if the shape is 7 and the batch size is 32, though. – Natthaphon Hongcharoen Apr 28 '19 at 12:19
  • Suppose that the length of my training examples is 1590 and my batch size is 32 (I know that a batch of 30 would work well, but let us keep it like that for the sake of the current example). Then the last batch will be of size 22. How do I handle these 22 samples other than discarding them? – Daisy Apr 28 '19 at 14:23
  • You don't need to do anything: 1590/32 is 49.6875, so you can just use 50 steps; something like `x[1568:1600]` won't error even though `x` has only 1590 elements. – Natthaphon Hongcharoen Apr 28 '19 at 14:28
  • Which is why I suggest you use `math.ceil` instead of `math.floor`; it will add one more batch if there is at least 1 sample left. – Natthaphon Hongcharoen Apr 28 '19 at 14:29
  • I get what you mean, but using ceil throws the following error: `tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [32,1] vs. [22,1] [[{{node training/Adam/gradients/loss/dense_1_loss/logistic_loss/mul_grad/BroadcastGradientArgs}}]]` – Daisy Apr 28 '19 at 14:37
  • Hmm, does it have something to do with `sparse_categorical_crossentropy`? Also, could you provide the project link? I always use this method but have never met this problem. – Natthaphon Hongcharoen Apr 28 '19 at 14:44
  • Is it the same as this problem https://github.com/keras-team/keras/issues/11749 ? Have you tried downgrading TF and Keras? – Natthaphon Hongcharoen Apr 28 '19 at 14:46
  • I am not using `sparse_categorical_crossentropy`. I searched a lot about that problem but with no luck. I updated the question with how I am using the generator and the model code. – Daisy Apr 28 '19 at 15:16
  • It runs without problem on my machine by just changing the input shape to a fixed number, since the Dense layer needs it to be specific. I updated the code I use in the main answer. – Natthaphon Hongcharoen Apr 28 '19 at 15:28

Apart from the strategy in the other answer, this issue could be tackled in different ways, depending on your intention.

If you wish to repeat some samples in the last batch (until the last batch's size equals batch_size), as you suggested in your question, you could, for example, check whether the last sample in the dataset has been reached and, if so, pad the batch. E.g.:

import numpy as np

batch_size = 32
N_batches = int(np.ceil(len(dataset) / batch_size))
batch_counter = 0
while True:
    current_batch = []
    idx_start = batch_size * batch_counter
    idx_end = batch_size * (batch_counter + 1)
    for idx in range(idx_start, idx_end):
        # Next line clamps idx to the index of the last sample in the dataset,
        # so the final batch is padded by repeating that sample:
        idx = len(dataset) - 1 if (idx > len(dataset) - 1) else idx
        current_batch.append(dataset[idx])
        ...
    batch_counter += 1
    if batch_counter == N_batches:
        batch_counter = 0

Obviously, it doesn't need to be the last sample; it could, for example, be a random sample from the dataset (this needs `import random`):

idx = random.randint(0, len(dataset) - 1) if (idx > len(dataset) - 1) else idx
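
If you would rather keep everything inside the Sequence subclass from the question, the same idea can be expressed in __getitem__. Here is a minimal sketch (the PaddedGenerator name and the choice to pad with random already-seen samples are just for illustration):

import math
import numpy as np
from keras.utils import Sequence


class PaddedGenerator(Sequence):
    # Like the generator in the question, but pads the final batch with
    # repeated samples so every batch has exactly batch_size items.
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        return math.ceil(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]
        if len(inds) < self.batch_size:
            # Repeat random indices from the whole dataset to fill up the batch.
            pad = np.random.choice(self.indices, self.batch_size - len(inds))
            inds = np.concatenate([inds, pad])
        return self.x[inds], self.y[inds]

    def on_epoch_end(self):
        np.random.shuffle(self.indices)

Every batch the model sees then has exactly batch_size samples, so a mismatch like [32,1] vs. [22,1] cannot occur, at the cost of slightly over-weighting the repeated samples.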

Hope this helps.