
I am currently trying to compare the similarity of millions of documents. For a first test on a CPU, I reduced them to around 50 characters each and am trying to get the ELMo embeddings for 10 of them at a time, like this:

import tensorflow_hub as hub

ELMO = "https://tfhub.dev/google/elmo/2"
texts = []
i = 0
for row in file:
    split = row.split(";", 1)
    if len(split) > 1:
        text = split[1].replace("\n", "")
        texts.append(text[:50])
    if i == 300:
        break
    if i % 10 == 0:
        elmo = hub.Module(ELMO, trainable=False)
        executable = elmo(
            texts,
            signature="default",
            as_dict=True)["elmo"]
        vectors = execute(executable)  # execute() runs the graph in a session
        texts = []
    i += 1

However, even with this small example, after around 300 sentences (and without even saving the vectors) the program consumes up to 12GB of RAM. Is this a known issue (the other issues I found suggest something similar, but not quite that extreme) or did I make a mistake?

Daniel Töws
  • You are passing in a variable `sentences` but we cannot see where this is defined – Stewart_R Jun 07 '19 at 06:55
  • Sorry, my bad. It should have been `texts` instead of `sentences` (in my code the ELMo part is in its own method, where the parameter is called `sentences`). I have edited it. – Daniel Töws Jun 07 '19 at 07:26

1 Answer


This is for TensorFlow 1.x without Eager mode, I suppose (or else the use of hub.Module would likely hit bigger problems).

In that programming model, you need to first express your computation in a TensorFlow graph, and then execute that graph repeatedly for each batch of data.

  • Constructing the module with hub.Module() and applying it to map an input tensor to an output tensor are both parts of graph building and should happen only once.

  • The loop over the input data should merely call session.run() to feed input and fetch output data from the fixed graph.
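As a plain-Python analogy for this build-once, run-many split (using a compiled regex in place of a TensorFlow graph; no TensorFlow involved, names are illustrative):

```python
import re

# "Graph building": done exactly once, like constructing hub.Module()
# and applying it to a placeholder tensor.
WORD = re.compile(r"\w+")

def run_batch(texts):
    # "session.run()": a cheap, repeated call against the fixed object.
    return [WORD.findall(t) for t in texts]

for batch in [["hello world"], ["quick brown fox"]]:
    print(run_batch(batch))
```

Recompiling `WORD` inside the loop would be wasteful; rebuilding the ELMo graph inside the loop is the same mistake at a much larger scale, since each `hub.Module()` call adds a fresh copy of the model to the graph.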

Fortunately, there is already a utility function to do all this for you:

import numpy as np
import tensorflow_hub as hub

# For demo use only. Extend to your actual I/O needs as you see fit.
inputs = (x for x in ["hello world", "quick brown fox"])

with hub.eval_function_for_module("https://tfhub.dev/google/elmo/2") as f:
  for pystr in inputs:
    batch_in = np.array([pystr])
    batch_out = f(batch_in)
    print(pystr, "--->", batch_out[0])

What this does for you in terms of raw TensorFlow is roughly this:

module = Module(ELMO_OR_WHATEVER)
tensor_in = tf.placeholder(tf.string, shape=[None])  # As befits `module`.
tensor_out = module(tensor_in)

# This kind of session handles init ops for you.
with tf.train.SingularMonitoredSession() as sess:
  for pystr in inputs:
    batch_in = np.array([pystr])
    batch_out = sess.run(tensor_out, feed_dict={tensor_in: batch_in})
    print(pystr, "--->", batch_out[0])

If your needs are too complex for `with hub.eval_function_for_module ...`, you could build on this more explicit example.

Notice how the hub.Module is neither constructed nor called in the loop.
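To feed batches of 10 texts (as in the question) instead of single strings, the text preparation can also stay in plain Python, cleanly separated from any TensorFlow calls. A hypothetical helper, assuming semicolon-separated rows as in the question (the name and parameters are illustrative):

```python
def batched_texts(rows, batch_size=10, max_rows=300, max_chars=50):
    """Yield lists of up to batch_size cleaned, truncated text snippets."""
    batch = []
    for i, row in enumerate(rows):
        if i == max_rows:
            break
        split = row.split(";", 1)
        if len(split) > 1:
            batch.append(split[1].replace("\n", "")[:max_chars])
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:  # flush any final partial batch
        yield batch

for batch in batched_texts(["id;hello world\n", "id;quick brown fox\n"],
                           batch_size=2):
    print(batch)
```

Each yielded batch can then be fed to the graph that was built once outside the loop, e.g. `f(np.array(batch))` with the eval function, or via `feed_dict` in the explicit version.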

PS: Tired of worrying about building graphs vs running sessions? Then TF2 and eager execution are for you. Check out https://colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/tf2_text_classification.ipynb

arnoegw
  • It worked. I still get an increase (from 1.5GB to 2GB), which I can not explain, but it seems much more manageable. How does it work though? I thought Python had its own form of "garbage collection" where an object that is no longer referenced gets deleted. Shouldn't this have happened? – Daniel Töws Jun 07 '19 at 11:27
  • My first answer was incomplete. Should be much more satisfactory now. – arnoegw Jun 07 '19 at 13:01
  • The example code with `eval_function_for_module` didn't work for me: I got `InternalError: Dst tensor is not initialized. [[{{node checkpoint_initializer_14}}]]` – Yu Shen Sep 23 '19 at 02:36