
I have started learning TensorFlow and am having difficulty understanding placeholders and variables.

I am trying to write a function for matrix multiplication. It works when using tf.constant, but I have trouble understanding how to use variables.

Here is my code:

import tensorflow as tf
import numpy as np 


mat_1 = np.array([[0,1,1,0], [1,0,1,0], [1,0,0,1], [0,1,1,0]]).astype('int32')
mat_2 = np.array([[0,1,1,0], [1,0,1,0], [1,0,0,1], [0,1,1,0]]).astype('int32')


def my_matmult1(mat_1, mat_2):
    #define session
    x_sess = tf.Session()

    with x_sess:
        xmat_1 = tf.constant(mat_1)
        xmat_2 = tf.constant(mat_2)
        r1 = tf.matmul(xmat_1, xmat_2)
        qq1 = x_sess.run(r1)

    return qq1    

def my_matmult2(mat_1, mat_2):
    #define session
    x_sess1 = tf.Session()

    with x_sess1:
        #initialize placeholders
        xmat_1_plh = tf.placeholder(dtype=mat_1.dtype, shape=mat_1.shape)
        xmat_2_plh = tf.placeholder(dtype=mat_2.dtype, shape=mat_2.shape)  

        #create variables
        x_mat_1 = tf.Variable(xmat_1_plh, trainable = False)
        x_mat_2 = tf.Variable(xmat_2_plh, trainable = False)

        x_sess1.run(tf.initialize_all_variables())

        #
        r1 = tf.matmul(xmat_1, xmat_2)
        qq1 = x_sess1.run(r1, feed_dic={mat_1, mat_2})

    return qq1  

This works as expected:

my_matmult1(mat_1, mat_1)

However, the following fails:

my_matmult2(mat_1, mat_1)

with the following error:

InvalidArgumentError

You must feed a value for placeholder tensor 'Placeholder' with dtype int32 and shape [4,4]

The same error occurs even after changing the last line to

qq1 = x_sess1.run(r1, feed_dic={tf.convert_to_tensor(mat_1), tf.convert_to_tensor(mat_2)})

What am I doing wrong?

user1043144

3 Answers


In order to answer this question meaningfully, I have to go back to how TensorFlow is designed to work.

Graphs
Graphs in TensorFlow are just a map/path that the computation will take. A graph does not hold any values and does not execute anything.

Session
A session, on the other hand, needs a graph, data, and a runtime to execute. This concept of graphs and sessions lets TensorFlow separate the flow definition (the model) from the actual computation runtime.

Separating the runtime from the flow graph
This was most probably done to separate the graph definition from the runtime configuration and the actual execution with data. For example, the runtime can be a cluster. Each of the execution runtimes in the cluster needs the same definition of the graph, but each runtime may locally have a different set of data during execution. So it is important that input and output data can be supplied during distributed execution in the cluster.
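This split can be sketched with a minimal example: one graph definition, executed twice with different data. The sketch below uses the TF 1.x-style API (written against tf.compat.v1 so it also runs on current TensorFlow; on an actual 1.x install, plain `import tensorflow as tf` works the same way):

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF 1.x-style API on a modern install
tf.disable_v2_behavior()

# Graph definition: describes the computation, holds no values, executes nothing.
x = tf.placeholder(tf.float32, shape=(2, 2), name="x")
y = tf.matmul(x, x)

# Execution: the same graph definition is run with different input data.
with tf.Session() as sess:
    a = sess.run(y, feed_dict={x: np.eye(2, dtype=np.float32)})
    b = sess.run(y, feed_dict={x: 2 * np.eye(2, dtype=np.float32)})
```

The graph-building lines allocate no data; only the two `sess.run` calls actually compute, each with its own feed.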

Why Placeholders and Not Variables
Placeholders act as input / output conduits for the Graphs. If you visualize your graph as a number of nodes - placeholders are input or output nodes.

The real question is: why does TensorFlow not use a normal variable for the I/O nodes? Why have another type?

During the training process (when the program is executing in a session), it must be ensured that actual values are used to train the model. The feed_dict inside a training step accepts only actual values, e.g. a NumPy ndarray. These actual values cannot be supplied by a TensorFlow variable, because TensorFlow variables hold no data unless eval() or session.run() is used. However, the training statement itself is part of a session.run() call, so it cannot take another session.run() inside it to resolve the tensor variable to data. By this time, a session.run() already has to bind to a specific runtime configuration and data.
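The distinction between the three node types can be illustrated with a small sketch (TF 1.x-style API via tf.compat.v1): a constant carries its value in the graph, a placeholder must be fed an actual ndarray, and a variable lives in the runtime and must be initialized before it holds data.

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

data = np.ones((2, 2), dtype=np.float32)

c = tf.constant(data)                          # value baked into the graph at definition time
p = tf.placeholder(tf.float32, shape=(2, 2))   # value must be fed as an actual ndarray at run time
v = tf.Variable(np.zeros((2, 2), dtype=np.float32), trainable=False)  # value lives in the runtime

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # variables hold no data until initialized
    out_c = sess.run(c)                          # no feed needed
    out_p = sess.run(p, feed_dict={p: data})     # placeholder resolved by feeding a NumPy array
    out_v = sess.run(v)                          # variable resolved by the session itself
```

Omitting the feed for `p`, or the initializer run for `v`, reproduces exactly the kind of error the question hit.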

Santanu Dey

Your code should work if you remove the tf.Variable() lines after you created the placeholders (and modify the name of the fed variables accordingly).

Placeholders are for the data you want to feed your model with. Variables are for the parameters of your model (like weights).

Therefore you correctly created two placeholders, but then you created additional variables for no reason, which probably messes something up in the TensorFlow graph.

The function would look like:

import tensorflow as tf
import numpy as np

def my_matmult2(mat_1, mat_2):
    #define session
    x_sess1=tf.Session()

    with x_sess1:
        #initialize placeholders
        xmat_1_plh = tf.placeholder(dtype=mat_1.dtype, shape=mat_1.shape)
        xmat_2_plh = tf.placeholder(dtype=mat_2.dtype, shape=mat_2.shape)  

        r1 = tf.matmul(xmat_1_plh, xmat_2_plh)

        qq1 = x_sess1.run(r1, feed_dict={xmat_1_plh: mat_1 , xmat_2_plh: mat_2})

    return qq1 

mat_1=np.ones((5,5))
mat_2=np.ones((5,5))

b=my_matmult2(mat_1,mat_2)
print(b)
jeandut
  • Thanks @jean, it works indeed. My idea (perhaps naive, given that I am just starting with TF) was to replace some matrix operations I do in NumPy with TF. I hope to gain (1) speed for large matrices and (2) a way around out-of-core limits. But as I said, I am just learning – user1043144 Jul 21 '16 at 09:37

You're not feeding the dictionary properly. The keys in feed_dict must refer to the placeholders, either as the Python objects or by their tensor names. I have also added explicit names; you might be able to use "xmat_1_plh" as the name, but I prefer to set my own. I also think you have some extra lines in my_matmult2(): the x_mat_1/x_mat_2 variables don't add much, but probably don't hurt (maybe a little performance cost from adding extra ops to the graph).

def my_matmult2(mat_1, mat_2):
    #define session
    x_sess1 = tf.Session()

    with x_sess1:
        #initialize placeholders
        xmat_1_plh = tf.placeholder(dtype=mat_1.dtype, shape=mat_1.shape, name="xmat1")
        xmat_2_plh = tf.placeholder(dtype=mat_2.dtype, shape=mat_2.shape, name="xmat2")

        #create variables (their initializers depend on the placeholders)
        x_mat_1 = tf.Variable(xmat_1_plh, trainable=False)
        x_mat_2 = tf.Variable(xmat_2_plh, trainable=False)

        #the initializers read the placeholders, so the feed happens here,
        #keyed by the tensor names given above
        x_sess1.run(tf.initialize_all_variables(),
                    feed_dict={"xmat1:0": mat_1, "xmat2:0": mat_2})

        r1 = tf.matmul(x_mat_1, x_mat_2)
        qq1 = x_sess1.run(r1)

    return qq1

I am not sure what the final goal of this function is, but you are creating nodes in the graph each time it runs. Because of that, you likely want to move graph construction and the ".run()" statement out of this function (to wherever you actually want to multiply the two matrices), since you should not rebuild the graph in a loop if you are just looking for a way to multiply two matrices.

If this is a single test/call to my_matmult2(), what you have should work with the corrections to the feed dictionary.
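To make the "build once, run many times" point concrete, here is a minimal sketch (TF 1.x-style API via tf.compat.v1) where the multiplication graph is constructed a single time and then executed repeatedly with different inputs:

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Build the graph once, outside any loop or repeated function call.
a_plh = tf.placeholder(tf.float32, shape=(4, 4), name="a")
b_plh = tf.placeholder(tf.float32, shape=(4, 4), name="b")
prod = tf.matmul(a_plh, b_plh)

# Run it repeatedly with different data; no new nodes are added per call.
results = []
with tf.Session() as sess:
    for i in range(3):
        m = np.full((4, 4), float(i), dtype=np.float32)
        results.append(sess.run(prod, feed_dict={a_plh: m, b_plh: m}))
```

Each iteration only feeds and runs; building `prod` inside the loop would instead grow the graph on every pass.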

mazecreator
  • Thanks, but I am afraid it still does not work. The function is not meant for production; I am just a little confused about placeholders/variables/constants and trying to learn – user1043144 Jul 11 '16 at 05:03