I want to create a batch generator for training, but I don't know how to make the generator move on to the next time steps once it has finished a batch. That is, if one batch has processed, say, [0 1 2 3 4], the next batch should process [5 6 7 8 9], and so on through the whole training set, say [0 1 2 ... 100].

I also want one epoch to be one pass through the entire training set, so the batch generator has to go back to the beginning of the training set after each pass.

From keras.io, I have read that if steps_per_epoch=None (the default), then it "is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined".

def batch_generator(batch_size, sequence_length):
    """
    Generator function for creating batches of training-data.
    """

    # Infinite loop.
    while True:
        # Allocate a new array for the batch of input-signals.
        x_shape = (batch_size, sequence_length, num_x_signals)
        x_batch = np.zeros(shape=x_shape, dtype=np.float16)

        # Allocate a new array for the batch of output-signals.
        y_shape = (batch_size, sequence_length, num_y_signals)
        y_batch = np.zeros(shape=y_shape, dtype=np.float16)

        # Fill the batch with random sequences of data.
        for i in range(batch_size):

            # Copy the sequences of data starting at this index.
            x_batch[i] = x_train_scaled[:sequence_length]
            y_batch[i] = y_train_scaled[:sequence_length]

        x_batch_1 = x_batch[ :, :, 0:5]
        x_batch_2 = x_batch[ :, :, 5:12]
        yield ([x_batch_1, x_batch_2], y_batch)

batch_size = 32
sequence_length = 24 * 7 

generator = batch_generator(batch_size=batch_size,
                            sequence_length=sequence_length)
%%time
model.fit_generator(generator=generator,
                    epochs=10,
                    steps_per_epoch=None,
                    validation_data=validation_data,
                    callbacks=callbacks)

1 Answer

If you can use TensorFlow, the batch and repeat methods of tf.data.Dataset do what you are describing.

Otherwise, the question "Keras - How are batches and epochs used in fit_generator()?" looks relevant.
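
To make the batch-then-repeat idea concrete, here is a minimal plain-Python sketch. It is only an illustration of the concept, not how TensorFlow or Keras implement it, and the names `batch_and_repeat`, `samples` and the toy sizes are made up for this example. As far as I understand, `fit_generator` simply pulls `steps_per_epoch` batches from the generator per epoch and does not reset it, so an epoch equals one pass only if `steps_per_epoch` matches the number of batches in one pass.

```python
import math


def batch_and_repeat(samples, batch_size):
    """Yield consecutive batches forever, starting over after each full pass."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


# Toy data standing in for the training set (hypothetical sizes).
samples = list(range(20))
batch_size = 5

# Number of batches that make up exactly one pass over the data.
steps_per_epoch = math.ceil(len(samples) / batch_size)

gen = batch_and_repeat(samples, batch_size)
for epoch in range(2):
    # Mimic a training loop that draws steps_per_epoch batches per epoch.
    batches = [next(gen) for _ in range(steps_per_epoch)]
    print(epoch, batches[0], batches[-1])
```

Because the generator wraps around on its own, every epoch in this loop starts at the first sample again.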

bantmen
I am not familiar with TensorFlow, and I don't understand the code in the second link. What I'm thinking of is a start index that is passed to the batch assignment and updated every time a batch is produced (so within the generator the index advances by one sequence length per sample, and after a full batch it has advanced by sequence_length x batch_size), but I don't know how to do that. – Joichiro Nishi Oct 26 '19 at 20:21
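
The start-index idea from the comment could be sketched roughly as follows. This is one possible approach under a few assumptions, not a tested drop-in for the question's model: the data arrays are passed in as arguments instead of being read from globals, both arrays are 2-D with shape (time steps, signals), windows do not overlap, and any leftover time steps at the end that do not fill a whole batch are skipped. The name `sequential_batch_generator` is made up for this example; the 0:5 and 5:12 column split is copied from the question.

```python
import numpy as np


def sequential_batch_generator(x_data, y_data, batch_size, sequence_length):
    """Yield consecutive, non-overlapping windows and restart after one pass."""
    step = batch_size * sequence_length  # time steps consumed by one batch
    if len(x_data) < step:
        raise ValueError("not enough time steps for one full batch")

    start = 0  # index of the first time step of the next batch
    while True:
        # Go back to the beginning when a full batch no longer fits.
        if start + step > len(x_data):
            start = 0

        # Cut one contiguous block and fold it into (batch, time, signals).
        # Row i of the batch starts at start + i * sequence_length.
        x_batch = x_data[start:start + step].reshape(
            batch_size, sequence_length, -1)
        y_batch = y_data[start:start + step].reshape(
            batch_size, sequence_length, -1)

        # Same two-input split as in the question.
        yield [x_batch[:, :, 0:5], x_batch[:, :, 5:12]], y_batch

        # Advance by one whole batch worth of time steps.
        start += step
```

With this generator, one pass over the data is `len(x_data) // (batch_size * sequence_length)` batches, so passing that value as `steps_per_epoch` (instead of `None`) should make each epoch begin at the start of the training set.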