
I am new to TensorFlow. I have trained an encoder-decoder model using TensorFlow. The program takes a word as input and prints out its phonemes.

For example: Hello World -> ['h', 'E', 'l', '"', '@U', ' ', 'w', '"', '3`', 'r', '5', 'd']

I would like to have access to the prediction probability of each phoneme chosen.

In the prediction section, the code I am using is the following:

def predict(words, sess):
    # Handle batches larger than hp.batch_size by recursing on the tail.
    if len(words) > hp.batch_size:
        after = predict(words[hp.batch_size:], sess)
        words = words[:hp.batch_size]
    else:
        after = []

    # Encode each word as grapheme indices, padded to hp.maxlen (0: <PAD>).
    x = np.zeros((len(words), hp.maxlen), np.int32)
    for i, w in enumerate(words):
        for j, g in enumerate((w + "E")[:hp.maxlen]):
            x[i][j] = g2idx.get(g, 2)

    # Greedy autoregressive decoding: at step j, feed the predictions made
    # so far back in as y and keep only the j-th output column.
    preds = np.zeros((len(x), hp.maxlen), np.int32)
    for j in range(hp.maxlen):
        xpreds = sess.run(graph.preds, {graph.x: x, graph.y: preds})
        preds[:, j] = xpreds[:, j]

My main problem is where these probabilities are "hidden" and how to access them. For example, the letter "o" in the word "Hello" was mapped to the phoneme "@U". I would like to find out with what probability "@U" was chosen as the best phoneme.

Thank you in advance!

  • Can you format the code correctly? Not sure if everything in the code block is supposed to be within the `predict` function. What have you tried doing to print out the probabilities? Trying to determine if this is a duplicate question, or if there's more to it – IanQ Nov 21 '18 at 16:13
  • Thank you for your answer. Actually, I have tried some things found online: np.argmax, tf.argmax, tf.nn.top_k, and other commands. Part of the problem is that even when these commands produce something, there is a problem accessing and reading the data, mostly because they are tensors – Konstantinos Markopoulos Nov 21 '18 at 16:27
  • Look at `tf.Print` or `eval` [stackoverflow questions with answers](https://stackoverflow.com/questions/33633370/how-to-print-the-value-of-a-tensor-object-in-tensorflow) – IanQ Nov 21 '18 at 16:32
  • Possible duplicate of [How to print the value of a Tensor object in TensorFlow?](https://stackoverflow.com/questions/33633370/how-to-print-the-value-of-a-tensor-object-in-tensorflow) – IanQ Nov 21 '18 at 16:32
  • My problem is not (only) how to interpret a tensor in a readable form. My main problem is where these probabilities are "hidden" and how to access them. For example, the letter "o" in the word "Hello" was mapped to the phoneme "@U". I would like to find out with what probability "@U" was chosen as the best phoneme. I know that it is hidden somewhere inside `xpreds = sess.run(graph.preds, {graph.x: x, graph.y: preds})`, but I don't know how to access this information – Konstantinos Markopoulos Nov 21 '18 at 16:39
  • I'm assuming you're taking the softmax then doing a tf argmax on that. You'd want to do a tf.Print before taking the argmax? – IanQ Nov 21 '18 at 16:42
  • Thank you for your answer. I tried tf.nn.softmax(preds) and after that tf.Print. It does not return anything in that case. – Konstantinos Markopoulos Nov 21 '18 at 16:54
  • Please copy-paste your code; I have literally no idea what you're doing without seeing it – IanQ Nov 21 '18 at 17:15
  • I have used many things. One of these is: `for j in range(hp.maxlen): xpreds = sess.run(graph.preds, {graph.x: x, graph.y: preds}); preds[:, j] = xpreds[:, j]; axa = tf.nn.softmax(preds); z = tf.Print(axa); za = tf.argmax(axa); print(axa); print(za)`. The output is nothing; it doesn't print anything. I also don't know in which variable I should look (see the note after this thread). Any help on that? Thank you! – Konstantinos Markopoulos Nov 21 '18 at 17:57
  • Do you have the complete training code (including the model definition) or are you using a frozen pb file? – Ohad Meir Nov 21 '18 at 18:58
  • The code that I use is based on https://github.com/Kyubyong/g2p The things that I have changed are mostly in preprocessing, in order to use my own data, but the heart remains the same. I used train.py to train my model and I am using g2p.py to make predictions. The point of interest is on line 85 of g2p.py. I have to find a way to use the information in `_preds = sess.run(graph.preds, {graph.x: x, graph.y: preds})` to not only make predictions but also print the probability of each prediction – Konstantinos Markopoulos Nov 21 '18 at 19:09
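A side note on the tf.Print attempts in the thread above: in TF 1.x, tf.Print creates a graph op that prints only when that op is actually executed inside sess.run; wrapping the result of an earlier session call builds an op that is never run, so nothing appears. A minimal sketch of wiring it in correctly (probs here is a hypothetical tensor name, not from the repo):

    # tf.Print must be part of the fetched graph to have any effect (TF 1.x).
    probs = tf.nn.softmax(logits)                       # probabilities inside the graph
    probs = tf.Print(probs, [probs], message="probs=")  # prints when probs is evaluated
    # sess.run(probs, feed_dict={...}) now triggers the print on each run.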

1 Answer


Following the discussion, I think I can point you to where the code should be changed. In train.py, line 104:

self.preds = tf.to_int32(tf.argmax(logits, -1))

They assign the preds variable to the index with the highest probability. To get the softmax probabilities instead, you can change that line as follows:

self.preds = tf.nn.softmax(logits)

I think that should do it.
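If you would rather leave the decoding behaviour untouched, a minimal alternative sketch is to expose the probabilities on a separate attribute (self.probs is a hypothetical name, not part of the repo) and keep self.preds as it was:

    # Sketch: keep the argmax decoding as-is and expose the softmax separately.
    self.probs = tf.nn.softmax(logits)               # (batch, maxlen, vocab) probabilities
    self.preds = tf.to_int32(tf.argmax(logits, -1))  # (batch, maxlen) indices, unchanged

sess.run can then fetch graph.preds and graph.probs together in one call.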

How to view the probabilities:

preds = np.zeros((len(x), hp.maxlen), np.int32)
for j in range(hp.maxlen):
    xpreds = sess.run(graph.preds, {graph.x: x, graph.y: preds})
    # print shape of output -> batch_size, max_length, number_of_output_options
    print(xpreds.shape)
    # print all class probabilities for the first timestep of the first word
    print(xpreds[0, 0])
    # print the probability of the network's top prediction at that timestep
    print(xpreds[0, 0, np.argmax(xpreds[0, 0])])
    # accumulate the decoded indices (not the probabilities) so the next step
    # conditions on them; this replaces the old preds[:, j] = _preds[:, j]
    preds[:, j] = np.argmax(xpreds[:, j], axis=-1)
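As a usage sketch, once the loop has filled preds, the decoded phonemes and their probabilities can be read out together. This assumes idx2p, the index-to-phoneme dictionary from the linked repo, and is an illustration rather than tested code:

    # One more pass: probs has shape (batch_size, maxlen, number_of_output_options).
    probs = sess.run(graph.preds, {graph.x: x, graph.y: preds})
    for j in range(hp.maxlen):
        idx = int(np.argmax(probs[0, j]))    # most likely phoneme index at step j
        print(idx2p[idx], probs[0, j, idx])  # phoneme and its probability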
Ohad Meir
  • Thank you for your answer. I changed the code as you proposed and I am now retraining my model. Any idea on how to access the probabilities when I use the g2p.py script, so I can get them for every input I use? Thank you in advance – Konstantinos Markopoulos Nov 22 '18 at 08:48
  • After you make this change, once you run `xpreds = sess.run(graph.preds, {graph.x: x, graph.y: preds})`, xpreds will contain the probabilities. Print it to the console and check. – Ohad Meir Nov 22 '18 at 13:11
  • By the way, you don't need to retrain; you can just use the latest checkpoint that you already have – Ohad Meir Nov 22 '18 at 13:30
  • Thank you very much. One last question (I hope the last). I replaced `tf.to_int32` with `tf.nn.softmax` as you proposed above in train.py. After that I ran g2p.py and a ValueError occurred: `ValueError: could not broadcast input array from shape (97) into shape (1)`. Should I use another manipulation to make my script run correctly? Thank you in advance – Konstantinos Markopoulos Nov 22 '18 at 14:21
  • The error is here: `preds[:, j] = xpreds[:, j]`. First you can do `print(xpreds.shape)` right after session.run to see the shape that you now get. You can also `print(xpreds[0])` to see that you now get probabilities. I'll edit the answer – Ohad Meir Nov 22 '18 at 14:51
  • Thanks for accepting. Hope it gets you a bit closer to the solution :) – Ohad Meir Nov 22 '18 at 17:18
  • Thank you very much. That really helped a lot :) – Konstantinos Markopoulos Nov 22 '18 at 22:23
  • Do you know why I get different results when using `self.preds = tf.to_int32(tf.argmax(logits, -1))` and `self.preds = tf.nn.softmax(logits)`? What I mean is, for example, when I use argmax the result for the word 'gare' is ['g', '"', 'E@', 'r', '+'], while for the same word with softmax the result is ['g', 'E@', 'r', '0']. How can I make softmax return the same results as argmax, and why is this happening? Thank you in advance! – Konstantinos Markopoulos Nov 23 '18 at 16:33
  • From looking at the code I can't say why you would get different results. Are you using the same checkpoint when checking the difference between the outputs? – Ohad Meir Nov 24 '18 at 09:10
  • Yes, they are all the same. The only line I changed is the one you suggested, replacing argmax with softmax. That's very weird. – Konstantinos Markopoulos Nov 24 '18 at 10:53
  • When I run this: `for j in range(hp.maxlen): xpreds = sess.run(graph.preds, {graph.x: x, graph.y: preds}); kzz = np.array(xpreds); kzz = np.reshape(kzz, (20, 97), order='F'); self.preds = tf.nn.softmax(logits); for i in range(len(kzz)): a1 = kzz[i][:]; ff1 = np.argmax(a1); k1 = a1[ff1]; print(ff1, k1); print(idx2p[ff1])` it takes the argmax of the probabilities and maps them to the wrong phonemes – Konstantinos Markopoulos Nov 24 '18 at 13:38
  • Shouldn't I use np.argmax? Should I use anything else to map the max probability? Thank you – Konstantinos Markopoulos Nov 24 '18 at 13:39
  • I mean that, even if I copy the code from your answer, the phoneme "E@" gets a higher probability than '"', which is the right answer when using argmax – Konstantinos Markopoulos Nov 24 '18 at 13:49
  • Well, I can suggest a way to debug this. Change the code in train.py to `self.preds = logits`. This way you'll get the raw output and can check what causes each result. To convert the output to softmax you can use, for example, `from sklearn.utils.extmath import softmax` (see the sketch after this thread) – Ohad Meir Nov 25 '18 at 07:45
  • Thank you for your help! I observed that the default code uses argmax with axis=-1. That must be the issue; somehow I have to do softmax with axis=-1. How can I exploit this information? softmax with axis=-1 gives me the same bad results too – Konstantinos Markopoulos Nov 25 '18 at 16:06
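Following the debugging suggestion in the thread above (setting self.preds = logits in train.py), here is a minimal sketch of converting the raw logits to probabilities outside the graph with sklearn's row-wise softmax; logits_to_probs is a hypothetical helper written for illustration:

    import numpy as np
    from sklearn.utils.extmath import softmax  # row-wise softmax over 2-D arrays

    def logits_to_probs(raw):
        # raw: (batch_size, maxlen, vocab_size) array returned by sess.run
        # once self.preds = logits. Flatten to 2-D, softmax each row,
        # then restore the original shape.
        flat = raw.reshape(-1, raw.shape[-1])
        return softmax(flat).reshape(raw.shape)

Because softmax is monotonic, np.argmax over the last axis of these probabilities reproduces the original argmax decoding exactly, so the two variants should agree once the decoding loop feeds indices (not probabilities) back into graph.y.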