I am programming a relatively small LSTM model in Google Colab.
For reference, I am using TensorFlow 1.13 to build the model, with tensorflow.keras as the Keras API.
from tensorflow.keras import layers as ll
from tensorflow.keras.models import Model

seq_len = 20000; n_classes = 4

inputs = ll.Input(shape=(seq_len,))
# word_index is the vocabulary built by the tokenizer on my corpus
x = ll.Embedding(len(word_index), 1000)(inputs)
x = ll.LSTM(units=100, activation='relu', return_sequences=True)(x)
outputs = ll.Dense(units=n_classes, activation='softmax')(x)

model = Model(inputs, outputs)
model.summary()
I have checked that I have 15 GB of GPU RAM available, and by my estimate the model with a batch size of 32 should fit in about 3 GB of RAM.
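For reference, a rough activation-size calculation (float32 forward activations only; weights, gradients and optimizer state not counted, so this is only a ballpark) lands near that figure:

# Forward activations at float32 (4 bytes each), batch size 32
batch = 32
embedding_out = batch * seq_len * 1000 * 4       # ~2.56 GB
lstm_out = batch * seq_len * 100 * 4             # ~0.26 GB
dense_out = batch * seq_len * n_classes * 4      # ~0.01 GB
print((embedding_out + lstm_out + dense_out) / 1e9)  # ~2.8 GB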
However, whenever I launch training, the server runs out of memory.
To be fair, I am using extremely long sequences (20000 is the maximum sequence length), but I would expect the recurrence to be handled symbolically rather than fully unrolled in memory, so the model should still fit.
Reducing the batch size to 1 does not help either.
What is going on? How can I make this model fit in memory?
EDIT: I tried reducing the sequence length to 2 and that indeed makes it fit in memory. But I need the sequence length to remain high. How can I tell TensorFlow not to unroll the network at any point? (I suspect that is what is happening behind the scenes; how can I check whether this is indeed the case?)
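For what it's worth, the Keras LSTM layer exposes an unroll argument, which as far as I know already defaults to False; setting and inspecting it explicitly would look like this, though I am not sure this is what controls the behaviour I am seeing:

# unroll=False (the default) should keep the recurrence as a symbolic graph loop
lstm_layer = ll.LSTM(units=100, activation='relu',
                     return_sequences=True, unroll=False)
print(lstm_layer.unroll)  # False, so the layer itself is not configured to unroll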
EDIT: If I remove the Softmax layer then the memory use drops back to the normal range. I think the Softmax layer is causing TensorFlow to unroll the network. TimeDistributing the Softmax does not help though.
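Concretely, the TimeDistributed variant I tried was along these lines (replacing the original output layer):

# Apply the Dense(softmax) head independently at every timestep
outputs = ll.TimeDistributed(
    ll.Dense(units=n_classes, activation='softmax'))(x)
model = Model(inputs, outputs)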