I'm trying to train a very basic, small LSTM model in TensorFlow on a GTX 1080 (although I've tried other cards, too). Depending on some parameters (like the hidden state size), I get a `ResourceExhaustedError` after a fairly consistent number of iterations.
The model isn't much more than an embedding matrix (ca. 5000×300) and a single-layer LSTM, with a final dense projection at every timestep. I have tried batch sizes as small as 1 and a hidden state size of just 20, but I still run out of memory on an 8 GB card, with a total training data size of 5M.
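For reference, the architecture is roughly equivalent to the sketch below (a simplified stand-in, not the actual code; the real definitions are in the pastebins further down, and the sizes here are just the approximate values mentioned above):

```python
import tensorflow as tf

# Approximate sizes from the description above; the real values are in the
# linked model definition.
vocab_size, embed_dim, hidden_size, num_classes = 5000, 300, 20, 5000

inputs = tf.placeholder(tf.int32, [None, None], name="inputs")  # [batch, time]
seq_lens = tf.placeholder(tf.int32, [None], name="seq_lens")    # per-example lengths

embeddings = tf.get_variable("embeddings", [vocab_size, embed_dim])
embedded = tf.nn.embedding_lookup(embeddings, inputs)           # [batch, time, embed_dim]

cell = tf.nn.rnn_cell.LSTMCell(hidden_size)
outputs, _ = tf.nn.dynamic_rnn(cell, embedded,
                               sequence_length=seq_lens,
                               dtype=tf.float32)                # [batch, time, hidden_size]

# Dense projection applied at every timestep (weights shared across time)
logits = tf.layers.dense(outputs, num_classes)                  # [batch, time, num_classes]
```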
I can't wrap my head around why this is happening. I've obviously tried the suggestions from related questions here on Stack Overflow, including reducing `per_process_gpu_memory_fraction` in the TF GPU options, but to no avail.
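Roughly, this is how I set that option (the 0.5 here is just an example value; the actual setting is in the training script linked below):

```python
import tensorflow as tf

# Cap the fraction of GPU memory TF allocates up front (example value only).
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
config = tf.ConfigProto(gpu_options=gpu_options)

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    # ... training loop ...
```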
See code here:
- https://pastebin.com/1MSUrt15 (main training script)
- https://pastebin.com/U1tYbM8A (model definition)
[This doesn't include some utility scripts, so it won't run on its own. I also deleted some functions for brevity. The code is designed for multi-task learning, which introduces some overhead here, but the memory problems persist in single-task setups.]
PS: One thing I know I'm not doing 100% efficiently is storing all training data as a `numpy` array, sampling batches from there, and using TF's `feed_dict` to provide the data to my model. This, I believe, can slow down computation to some degree, but it shouldn't cause such severe memory issues, right?
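For illustration, the feeding pattern looks roughly like this (the file names, the `targets` placeholder, and `train_op`/`loss` are made-up names for this sketch, referring back to the model sketch above; the real sampling logic is in the training script):

```python
import numpy as np

batch_size = 1      # I've gone as low as 1
num_steps = 100000

# All training data is held in memory as numpy arrays (names are placeholders).
all_inputs = np.load("train_inputs.npy")    # [num_examples, max_time], int ids
all_targets = np.load("train_targets.npy")  # [num_examples, max_time]

for step in range(num_steps):
    idx = np.random.randint(0, len(all_inputs), size=batch_size)
    feed = {
        inputs: all_inputs[idx],
        targets: all_targets[idx],
        seq_lens: np.sum(all_inputs[idx] != 0, axis=1),  # assuming id 0 = padding
    }
    _, loss_val = sess.run([train_op, loss], feed_dict=feed)
```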