
I have a memory leak with TensorFlow 1.14. I went through various GitHub issues and Memory leak with TensorFlow to address my issue, and I followed the advice of the answer, which seemed to have solved the problem. However, it does not work here. I even ported the code to TensorFlow 2.1 and 2.3 but still could not solve the problem.

Whenever I load the model, the memory leak appears. I tried clearing the session after the model is loaded and also used the garbage collection API, but the leak still persists.

In order to recreate the memory leak, I have created a simple example. I use the function below to check the memory used by the Python process.

import os
import psutil

def memory_usage_func():
    process = psutil.Process(os.getpid())
    mem_used = process.memory_info().rss >> 20  # resident set size in MiB
    print("Memory used:", mem_used)
    return mem_used

Below is the code that loads the model and checks memory usage:

from tensorflow.keras.models import load_model

for i in range(100):
    model = load_model('./model_example.h5', compile=False)
    del model
    memory_usage_func()  # reported memory keeps growing each iteration

In the above code, the memory leak persists. I also tried running prediction: I created a session, loaded the model, and called predict(). There I face the same memory leak. I used tf.keras.backend.clear_session() and gc.collect() after the model is loaded, but they are unable to clear the session and free the memory.
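To make the growth easier to quantify than eyeballing printed totals, one option is a small psutil-based harness that reports the RSS delta around any callable. This is a minimal sketch, assuming only psutil is installed; the list-building lambda is a stand-in for the real `load_model('./model_example.h5', compile=False)` call:

```python
import gc
import os
import psutil

def rss_mib():
    # Resident set size of the current process, in MiB.
    return psutil.Process(os.getpid()).memory_info().rss >> 20

def measure(fn, *args, **kwargs):
    """Run fn and return (result, RSS growth in MiB)."""
    gc.collect()
    before = rss_mib()
    result = fn(*args, **kwargs)
    gc.collect()
    return result, rss_mib() - before

# Stand-in workload; in the question this would be the
# load_model(...) call inside the loop.
_, delta = measure(lambda: [bytes(1024) for _ in range(1000)])
print("RSS growth (MiB):", delta)
```

Calling `measure` repeatedly around `load_model` makes it easy to see whether the per-iteration delta ever drops to zero.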

Pallavi

1 Answer


I ran into a similar issue when I tried to use a pre-trained embedding model to generate embeddings as an input feature set. While using universal-sentence-encoder-4, the memory used for generating embeddings is not released. Neither tf.keras.backend.clear_session() nor gc.collect() helped.

I ended up using the multiprocessing module to run the work in a subprocess, repeating:

  1. do the work in a child process
  2. kill the process (its memory is released)
import multiprocessing as mp
import time

import tensorflow_hub as hub

def get_embedding(model_dir, text, q):
    print("      loading encoder")
    encoder = hub.load(model_dir)
    print("      generate embedding")
    encoded = encoder(text).numpy()
    print("      returning embedding")
    q.put(encoded)


manual_wait_time = 20

## inside the batching loop
q = mp.Queue()
p = mp.Process(target=get_embedding,
               args=(encoder_dir, some_text, q))
p.start()
## p.join() ## hangs forever, see reference
print("manually wait %s seconds" % manual_wait_time)
time.sleep(manual_wait_time)
embedded_mini_batch = q.get()
q.close()
p.terminate()
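As a side note on the join() hang: the multiprocessing docs note that a process which has put items on a Queue will not terminate until the buffered items are flushed, so join() can block until the queue is drained. Calling q.get() before joining avoids the manual sleep entirely. A minimal, TF-free sketch of that pattern (the squaring worker is just a stand-in for get_embedding):

```python
import multiprocessing as mp

def worker(n, q):
    # Build the result entirely in the child process; all of its
    # memory is released back to the OS when the process exits.
    q.put([i * i for i in range(n)])

def run_in_subprocess(n):
    q = mp.Queue()
    p = mp.Process(target=worker, args=(n, q))
    p.start()
    result = q.get()  # drain the queue BEFORE joining
    p.join()          # now join() returns promptly
    return result

if __name__ == "__main__":
    print(run_in_subprocess(5))  # [0, 1, 4, 9, 16]
```

With the queue drained first, no fixed sleep is needed and the child exits cleanly without terminate().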

references:

camel_case