
I am coding a TensorFlow project in which I edit each weight and bias manually, so I set up the weights and biases with dictionaries, as in old TensorFlow, rather than using tf.layers.dense and letting TensorFlow take care of updating the weights. (This is the cleanest way I came up with, although it might not be ideal.)

I feed a fixed model the same data in each iteration, but the running time increases throughout program execution.

I cut out almost everything from my code to narrow down where the issue lies, but I still cannot tell what is causing the increase in running time.

---Games took   2.6591222286224365 seconds ---
---Games took   3.290001153945923 seconds ---
---Games took   4.250034332275391 seconds ---
---Games took   5.190149307250977 seconds ---

Edit: I have managed to reduce the running time by using a placeholder, which doesn't add additional nodes to the graph, but the running time still increases, just at a slower rate. I'd like to remove this growth entirely. (It goes from 0.1 seconds to over 1 second after a while.)

Here is my whole code:

import numpy as np
import tensorflow as tf
import time

n_inputs = 9
n_class = 9

n_hidden_1 = 20

population_size = 10
weights = []
biases = []
game_steps = 20 #so we can see performance loss faster

# 2 games per individual
games_in_generation = population_size/2


def generate_initial_population(my_population_size):
    my_weights = []
    my_biases = []

    for key in range(my_population_size):
        layer_weights = {
            'h1': tf.Variable(tf.truncated_normal([n_inputs, n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_hidden_1, n_class], seed=key))
        }
        layer_biases = {
            'b1': tf.Variable(tf.truncated_normal([n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_class], seed=key))
        }
        my_weights.append(layer_weights)
        my_biases.append(layer_biases)
    return my_weights, my_biases


weights, biases = generate_initial_population(population_size)
data = tf.placeholder(dtype=tf.float32) #will add shape later

def model(x):
    out_layer = tf.add(tf.matmul([biases[1]['b1']], weights[1]['out']),  biases[1]['out'])
    return out_layer


def play_game():

    model_input = [0] * 9
    model_out = model(data)

    for game_step in range(game_steps):

        move = sess.run(model_out, feed_dict={data: model_input})[0]


sess = tf.Session()
sess.run(tf.global_variables_initializer())
while True:
    start_time = time.time()
    for _ in range(int(games_in_generation)):
        play_game()
    print("---Games took   %s seconds ---" % (time.time() - start_time))

2 Answers


There are some strange things going on in this code, so it's going to be tricky to give you an answer that really solves the underlying problem. I can, however, address the growth in running time that you're observing. Below, I've modified your code to extract the input pattern generation and calls to model from the game loop.

import numpy as np
import tensorflow as tf
import time

n_inputs = 9
n_class = 9

n_hidden_1 = 20

population_size = 10
weights = []
biases = []
game_steps = 20 #so we can see performance loss faster

# 2 games per individual
games_in_generation = population_size/2


def generate_initial_population(my_population_size):
    my_weights = []
    my_biases = []

    for key in range(my_population_size):
        layer_weights = {
            'h1': tf.Variable(tf.truncated_normal([n_inputs, n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_hidden_1, n_class], seed=key))
        }
        layer_biases = {
            'b1': tf.Variable(tf.truncated_normal([n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_class], seed=key))
        }
        my_weights.append(layer_weights)
        my_biases.append(layer_biases)
    return my_weights, my_biases


weights, biases = generate_initial_population(population_size)


def model(x):
    out_layer = tf.add(tf.matmul([biases[1]['b1']], weights[1]['out']),  biases[1]['out'])
    return out_layer


def play_game():

    # Extract input pattern generation.
    model_input = np.float32([[0]*9])
    model_out = model(model_input)

    for game_step in range(game_steps):

        start_time = time.time()
        move = sess.run(model_out)[0]

        # print("---Step took   %s seconds ---" % (time.time() - start_time))


sess = tf.Session()
sess.run(tf.global_variables_initializer())
for _ in range(5):
    start_time = time.time()
    for _ in range(int(games_in_generation)):
        play_game()
    print("---Games took   %s seconds ---" % (time.time() - start_time))

If run, this code should give you something like:

---Games took   0.42223644256591797 seconds ---
---Games took   0.13168787956237793 seconds ---
---Games took   0.2452383041381836 seconds ---
---Games took   0.20023465156555176 seconds ---
---Games took   0.19905781745910645 seconds ---

Clearly this resolves the running time growth you're observing. It also reduces the maximum observed running time by an order of magnitude! The reason the growth was occurring is that every time you called model you were creating a new set of tf.Tensor objects and adding them to the graph. This misunderstanding is common, and is caused by trying to use Tensors in imperative Python code as though they were Python variables. I recommend reviewing all of the graphs guide before proceeding.
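If you want to see this for yourself, here is a minimal sketch (not part of the original code) that counts the operations in the default graph after each call to a model-building function; the count grows on every call because new nodes are added each time:

import tensorflow as tf

x = tf.placeholder(dtype=tf.float32, shape=[1, 9])
w = tf.Variable(tf.truncated_normal([9, 9]))

def build_model(inp):
    # Each call creates brand-new matmul nodes in the default graph.
    return tf.matmul(inp, w)

for i in range(3):
    build_model(x)
    # The operation count keeps increasing, so every subsequent
    # sess.run has a larger graph to work through.
    print(len(tf.get_default_graph().get_operations()))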

It's also important to note that this is not the correct way to pass a value to a graph in TensorFlow. I can see that you want to pass a different value to your model during every iteration of the game, but you can't accomplish this by passing a value into a Python function. You must create a tf.placeholder in your model graph, and feed the value you want your model to process to that placeholder. There are many ways to do this, but you can find one example here. I hope you find this helpful!
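As a rough sketch of that pattern (reusing the question's model, n_inputs, and game_steps, and assuming model actually consumes its x argument), you would build the graph once with a placeholder and only call sess.run inside the game loop:

import numpy as np
import tensorflow as tf

data = tf.placeholder(dtype=tf.float32, shape=[1, n_inputs])
model_out = model(data)  # build the graph exactly once

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for game_step in range(game_steps):
    # Only feed new values here; no new nodes are created per step.
    model_input = np.float32([[0] * n_inputs])
    move = sess.run(model_out, feed_dict={data: model_input})[0]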

  • Thanks! I was aware I was supposed to be passing a placeholder, but in this case it wasn't affecting the performance, correct? – Xitcod13 Jul 08 '18 at 02:38
  • Nope, you actually were impacting the running time of the game step because your call to `model` was creating `tf.Tensor`s, which takes time. – Justin Fletcher Jul 08 '18 at 02:40
  • If you refactor your code to use a `feed_dict` argument to `sess.run` and a placeholder for your model input, everything should perform as you expect it to. – Justin Fletcher Jul 08 '18 at 02:42
  • Thanks, it turns out I deleted the placeholder when making my code shorter. There is still some issue with the code that causes performance loss (although the loss is not as significant, so it takes longer to notice). – Xitcod13 Jul 08 '18 at 03:00

I'm adding another answer because the most recent edit to the question made a substantive change. You're still seeing growth in running time because you're still calling model more than once within the same session; you've just reduced the frequency with which you add nodes to the graph. What you need to do is create a new session for each model you want to build, and close each session when you're done with it. I've modified your code to do so, here:

import numpy as np
import tensorflow as tf
import time


n_inputs = 9
n_class = 9

n_hidden_1 = 20

population_size = 10
weights = []
biases = []
game_steps = 20 #so we can see performance loss faster

# 2 games per individual
games_in_generation = population_size/2


def generate_initial_population(my_population_size):
    my_weights = []
    my_biases = []

    for key in range(my_population_size):
        layer_weights = {
            'h1': tf.Variable(tf.truncated_normal([n_inputs, n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_hidden_1, n_class], seed=key))
        }
        layer_biases = {
            'b1': tf.Variable(tf.truncated_normal([n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_class], seed=key))
        }
        my_weights.append(layer_weights)
        my_biases.append(layer_biases)
    return my_weights, my_biases



def model(x):
    out_layer = tf.add(tf.matmul([biases[1]['b1']], weights[1]['out']),  biases[1]['out'])
    return out_layer


def play_game(sess):

    model_input = [0] * 9

    model_out = model(data)

    for game_step in range(game_steps):

        move = sess.run(model_out, feed_dict={data: model_input})[0]

while True:

    for _ in range(int(games_in_generation)):

        # Reset the graph.
        tf.reset_default_graph()

        weights, biases = generate_initial_population(population_size)
        data = tf.placeholder(dtype=tf.float32) #will add shape later

        # Create session.
        with tf.Session() as sess:

            sess.run(tf.global_variables_initializer())

            start_time = time.time()

            play_game(sess)

            print("---Games took   %s seconds ---" % (time.time() - start_time))

            sess.close()

What I did here is wrap the call to play_game in a session defined in a with scope, and exit that session with sess.close after the call to play_game. I also reset the default graph. I've run this for several hundred iterations and have seen no increase in running time.
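As a quick illustrative check (not part of the original answer), you can verify that this pattern keeps the graph from growing by counting operations on each pass; after tf.reset_default_graph the count comes out the same every iteration:

import tensorflow as tf

for i in range(3):
    # Discard the previous graph before rebuilding the model.
    tf.reset_default_graph()
    x = tf.placeholder(dtype=tf.float32, shape=[1, 9])
    w = tf.Variable(tf.truncated_normal([9, 9]))
    y = tf.matmul(x, w)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # The count is constant on every pass, so per-iteration cost stays flat.
        print(len(tf.get_default_graph().get_operations()))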
