
I'm training a series of models in a for loop to test a certain architecture. While doing so, I run out of memory and the system kills the process.

The same problem appears in this question and this question. To try their solutions, I did a test run with a similar loop to the one that is giving me problems. The code is:

import numpy as np
import psutil
import keras
import tensorflow as tf

def mem_test(n):
    train_data = np.random.rand(1000, 1500)
    train_labels = np.random.randint(2, size=1000)
    mem = []
    for i in range(n):
        model = keras.Sequential([
            keras.layers.Dense(1000, activation=tf.nn.relu),
            keras.layers.Dense(2, activation=tf.nn.softmax)
        ])
        model.compile(optimizer=tf.train.AdamOptimizer(.001),
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
        model.fit(train_data, train_labels, epochs=1)
        mem.append(psutil.virtual_memory())
    return mem


def mem_test_clear(n):
    train_data = np.random.rand(1000, 1500)
    train_labels = np.random.randint(2, size=1000)
    mem = []
    for i in range(n):
        model = keras.Sequential([
            keras.layers.Dense(1000, activation=tf.nn.relu),
            keras.layers.Dense(2, activation=tf.nn.softmax)
        ])
        model.compile(optimizer=tf.train.AdamOptimizer(.001),
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
        model.fit(train_data, train_labels, epochs=1)
        mem.append(psutil.virtual_memory())
        keras.backend.clear_session()
        tf.reset_default_graph()
    return mem

While the latter seems to do slightly better than the former, both still end up accumulating memory usage. So, for my actual application, I'm left without a solution. What do I need to do in order to actually free up memory in this situation? What am I doing wrong?

msm

1 Answer


You have to compile the model only once. Then you can loop over the fitting:

import numpy as np
import psutil
import keras
import tensorflow as tf

def mem_test(n):
    train_data = np.random.rand(1000, 1500)
    train_labels = np.random.randint(2, size=1000)
    mem = []

    model = keras.Sequential([
        keras.layers.Dense(1000, activation=tf.nn.relu),
        keras.layers.Dense(2, activation=tf.nn.softmax)
    ])
    model.compile(optimizer=tf.train.AdamOptimizer(.001),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    for i in range(n):
        model.fit(train_data, train_labels, epochs=1)
        mem.append(psutil.virtual_memory())
    return mem

mem_test(50)

This way it consumes only a small, constant amount of memory and does not accumulate anything. Furthermore, this is how your model is meant to be trained: build and compile once, then fit repeatedly.

Geeocode
  • I think there might be a slight miscommunication about what it is I am trying to do. My goal is not to train a SINGLE model. It is to create a new model each time and train that model. – msm Nov 03 '18 at 20:10
  • It's ok, but your model parameters and construction don't change at all, so you only have to compile it once, outside the loop. – Geeocode Nov 03 '18 at 20:15
  • When you train your model, you actually just fit it again and again; that is the training process. Model construction and compilation should happen before the fitting. – Geeocode Nov 03 '18 at 20:21
  • In the case where you run this function more than once - that's why I asked my first question - you have to use Python's del and garbage collection to free up the memory held by the model instance. – Geeocode Nov 03 '18 at 20:24
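
Expanding on that last comment: when a fresh model per iteration is genuinely required, the pattern is to drop every strong reference to the old model and force a collection before building the next one. Below is a minimal sketch of that del + gc.collect pattern; FakeModel is a hypothetical stand-in for a Keras model (any large object behaves the same way), used so the snippet runs without TensorFlow. With an actual model you would additionally call keras.backend.clear_session() between iterations, as in mem_test_clear above:

```python
import gc
import weakref

class FakeModel:
    """Hypothetical stand-in for a Keras model holding a large allocation."""
    def __init__(self):
        self.weights = [0.0] * 1_000_000  # simulate a big buffer of parameters

def train_many(n):
    freed = []
    for i in range(n):
        model = FakeModel()
        ref = weakref.ref(model)    # weak reference lets us observe reclamation
        # ... model.fit(...) would go here ...
        del model                   # drop the last strong reference
        gc.collect()                # force collection of any reference cycles
        freed.append(ref() is None) # True if the old model was reclaimed
    return freed

print(train_many(3))  # → [True, True, True]
```

If any element comes back False, something else is still holding a reference to the model (a list it was appended to, a closure, an exception traceback), and that reference is what must be removed before the memory can be returned.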