In an LSTM network, I'm passing an array of features of the form

X.shape 
    (350000, 240, 1)

With a binary categorical target of the form

y.shape 
    (350000, 2)

How can I estimate the optimal batch size to minimize learning time without losing accuracy?

Here's the setup:

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense
from keras.callbacks import EarlyStopping

model = Sequential()
model.add(LSTM(25, input_shape=(240, 1)))
model.add(Dropout(0.1))
model.add(Dense(2, activation='softmax'))
# one-hot targets with a softmax output pair with categorical_crossentropy
model.compile(loss="categorical_crossentropy", optimizer="rmsprop")
model.fit(X_s, y_s, epochs=1000, batch_size=512, verbose=1, shuffle=False,
          callbacks=[EarlyStopping(patience=10)])
NicolasVega
1 Answer

Unfortunately, batch size is a hyperparameter you will have to tune by cross-validation. In practice, start with a large value (e.g. 1024) and repeatedly halve it until performance stops improving.
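The halving search above can be sketched in a few lines. This is a minimal, framework-free sketch: `train_and_eval` is a hypothetical callable (not from your code) that trains the model at a given batch size and returns a validation score where higher is better.

```python
def batch_size_search(train_and_eval, start=1024, min_size=32):
    """Halve the batch size while each halving improves the validation score.

    `train_and_eval(batch_size)` is a stand-in callable that trains the
    model with that batch size and returns a validation score
    (higher is better). Stops at the first halving that does not help.
    """
    best_size = start
    best_score = train_and_eval(start)
    size = start // 2
    while size >= min_size:
        score = train_and_eval(size)
        if score <= best_score:
            break  # no improvement: keep the previous batch size
        best_size, best_score = size, score
        size //= 2
    return best_size, best_score


# Toy stand-in scores, just to exercise the search logic:
scores = {1024: 0.80, 512: 0.85, 256: 0.90, 128: 0.88}
size, score = batch_size_search(lambda b: scores[b])
```

With the toy scores the search settles on batch size 256, since halving to 128 no longer improves the score. In practice each `train_and_eval` call would be a (short) training run with early stopping, so keep the number of candidate sizes small.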

There are also a few papers showing that learning rate and batch size are closely linked: "Don't Decay the Learning Rate, Increase the Batch Size" (https://arxiv.org/abs/1711.00489) argues that increasing the batch size has the same effect as decaying the learning rate, so the two should be scaled together rather than tuned independently.

Fede_v