I'm training a CNN in TensorFlow, but I'm having trouble with my loss, which is not improving. I've noticed that `tf.nn.softmax()` is returning a tensor with only 0s and 1s, not a distribution as I'd expect. Here's the repo; I believe that's the reason I'm unable to train the network, but I don't know how to solve it.

Alessandro Gaballo
1 Answer
Looking at the 2nd code box under *The Neural Network*:
```python
# output layer
with tf.variable_scope('output_lay') as scope:
    weights = weight_variable([4096, CLASSES])
    bias = bias_variable([CLASSES], 0.)
    activation = tf.nn.relu(tf.matmul(out, weights) + bias, name=scope.name)
    out = tf.nn.softmax(activation)
    return tf.reshape(out, [-1, CLASSES])
```
NB: ReLU activation is only used for hidden layers, not the output layer.
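To see why this matters, here is a small standalone NumPy sketch (not code from your repo; the numbers are made up) of how ReLU-ed, unnormalized activations push `softmax` towards the 0/1 outputs you are seeing:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Made-up pre-activations coming out of a final FC layer.
raw = np.array([-3.0, 1.5, -0.5, 0.2])

# ReLU clamps every negative entry to 0, so relative information
# between classes is partly lost before softmax is applied.
relu = np.maximum(raw, 0.0)
print(softmax(raw))   # a spread-out distribution
print(softmax(relu))  # already more peaked

# With large unnormalized activations (common after a ReLU-ed FC layer),
# softmax saturates to an almost one-hot vector -- the 0/1 symptom.
large = np.array([0.0, 40.0, 0.0, 0.0])
print(softmax(large))  # ~[0, 1, 0, 0]
```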
Then you are feeding this to the cross-entropy in your `train` function:
```python
logits = AlexNet(x_tr)

# loss function
cross_entropy = -tf.reduce_sum(tf.squeeze(y_tr) * tf.log(tf.clip_by_value(tf.squeeze(logits), 1e-10, 1.0)))
loss = tf.reduce_mean(cross_entropy)
```
Revisiting cross-entropy:

C = −(1/n) · Σ [ y·ln(a) + (1−y)·ln(1−a) ],  where a = sigmoid(W(x)+b).

So I suggest:
```python
with tf.variable_scope('output_lay') as scope:
    weights = weight_variable([4096, CLASSES])
    bias = bias_variable([CLASSES], 0.)
    return tf.matmul(out, weights) + bias
```
and for simplicity just use the inbuilt softmax cross-entropy function:
```python
logits = AlexNet(x_tr)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=ground_truth_input, logits=logits)
loss = tf.reduce_mean(cross_entropy)
```
`tf.nn.softmax_cross_entropy_with_logits` takes in `W(x)+b` and efficiently calculates the cross-entropy.
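Putting it together, here is a minimal TF1-style sketch of the pattern I mean (the shapes, initializers, and optimizer below are just placeholders, not your repo's actual values): the model returns raw logits, the fused op computes the loss, and `tf.nn.softmax` is only applied where you actually need probabilities.

```python
import tensorflow as tf

CLASSES = 10

# Stand-ins for the last hidden layer output and one-hot labels.
out = tf.placeholder(tf.float32, [None, 4096])
y_tr = tf.placeholder(tf.float32, [None, CLASSES])

# Output layer: raw W(x)+b, no ReLU, no softmax.
with tf.variable_scope('output_lay'):
    weights = tf.Variable(tf.truncated_normal([4096, CLASSES], stddev=0.01))
    bias = tf.Variable(tf.zeros([CLASSES]))
    logits = tf.matmul(out, weights) + bias

# Fused, numerically stable softmax + cross-entropy.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_tr, logits=logits)
loss = tf.reduce_mean(cross_entropy)
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

# Probabilities only where you need them (evaluation / prediction).
probs = tf.nn.softmax(logits)
```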

Pratik Kumar
- I think you're right about the ReLU in the output layer. I've tried using `tf.nn.softmax_cross_entropy_with_logits`, but I get `nan` for the loss. If I use my own way to compute it, then the problem of the softmax giving only 0 or 1 remains – Alessandro Gaballo May 04 '18 at 19:48
- @AlessandroGaballo is your `ground_truth_input` **One-hot** encoded? – Pratik Kumar May 04 '18 at 20:06
- Yes, in the `decode()` method I do `label = tf.one_hot(label, 10)` – Alessandro Gaballo May 04 '18 at 20:08
- In that case just input the raw `W(x)+b` to `tf.nn.softmax` (it internally takes care of applying softmax to `W(x)+b`) and see if that helps. [This](https://stackoverflow.com/questions/34240703/whats-the-difference-between-softmax-and-softmax-cross-entropy-with-logits#answer-39499486) may be of some help. – Pratik Kumar May 04 '18 at 20:27
- As I've mentioned, if I output `tf.nn.softmax()` and compute the cross-entropy manually, I get all 0s and 1s in the predictions; if I use `tf.nn.softmax_cross_entropy_with_logits()` to compute the loss, I get `nan`. In both cases the loss doesn't improve – Alessandro Gaballo May 04 '18 at 20:51