In machine learning and information theory, cross-entropy is a measure of dissimilarity between two probability distributions over the same underlying set of events: it gives the average number of bits (or nats) needed to encode events drawn from one distribution using a code optimized for the other. Cross-entropy is the most common choice of loss function for classification tasks in neural networks.
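For a quick intuition, here is a minimal NumPy sketch (the distributions are made up) of the defining formula H(p, q) = -sum_x p(x) log q(x):
import numpy as np
p = np.array([1.0, 0.0, 0.0])            # "true" distribution, e.g. a one-hot label
q = np.array([0.7, 0.2, 0.1])            # predicted distribution
cross_entropy = -np.sum(p * np.log(q))   # H(p, q) = -sum_x p(x) * log q(x)
print(cross_entropy)                     # ~0.357, i.e. -log(0.7)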
Questions tagged [cross-entropy]
360 questions
429
votes
10 answers
What is the meaning of the word logits in TensorFlow?
In the following TensorFlow function, we must feed the activations of the artificial neurons in the final layer. That I understand. But I don't understand why it is called logits. Isn't that a mathematical function?
loss_function =…

Milad P.
- 4,707
- 3
- 12
- 9
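In TensorFlow usage, "logits" simply means the raw, unnormalized scores coming out of the last layer, before softmax is applied; the *_with_logits ops apply the softmax internally. A minimal sketch in TF 2.x syntax (the numbers are made up):
import tensorflow as tf
logits = tf.constant([[2.0, 1.0, 0.1]])   # raw scores from the final layer, not probabilities
probs = tf.nn.softmax(logits)             # turns the logits into a probability distribution
labels = tf.constant([[1.0, 0.0, 0.0]])
loss = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)  # expects the raw scores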
126
votes
3 answers
What's the difference between sparse_softmax_cross_entropy_with_logits and softmax_cross_entropy_with_logits?
I recently came across tf.nn.sparse_softmax_cross_entropy_with_logits and I cannot figure out what the difference is compared to tf.nn.softmax_cross_entropy_with_logits.
Is the only difference that training vectors y have to be one-hot encoded when…

daniel451
- 10,626
- 19
- 67
- 125
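The practical difference, sketched below with made-up logits: the sparse variant takes integer class indices, the non-sparse variant takes one-hot (or soft) label distributions; for matching hard labels they produce the same values.
import tensorflow as tf
logits = tf.constant([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])
sparse_labels = tf.constant([0, 1])                     # integer class ids
onehot_labels = tf.one_hot(sparse_labels, depth=3)      # the same labels, one-hot encoded
loss_sparse = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=sparse_labels, logits=logits)
loss_dense = tf.nn.softmax_cross_entropy_with_logits(labels=onehot_labels, logits=logits)
# loss_sparse and loss_dense are equal for hard labels like these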
115
votes
3 answers
What is cross-entropy?
I know that there are a lot of explanations of what cross-entropy is, but I'm still confused.
Is it only a method to describe the loss function? Can we use the gradient descent algorithm to find the minimum of the loss function?

theateist
- 13,879
- 17
- 69
- 109
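Cross-entropy is just an objective function, and it can be minimized with gradient descent like any other differentiable loss. A toy logistic-regression sketch in NumPy (a single made-up sample):
import numpy as np
x = np.array([1.0, 2.0])                  # features of one sample
y = 1.0                                   # its true label
w = np.zeros(2)                           # model weights
for _ in range(100):
    p = 1.0 / (1.0 + np.exp(-w @ x))      # predicted probability
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))   # binary cross-entropy
    grad = (p - y) * x                    # gradient of the loss w.r.t. w
    w -= 0.1 * grad                       # gradient-descent step; the loss decreases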
101
votes
3 answers
How to choose cross-entropy loss in TensorFlow?
Classification problems, such as logistic regression or multinomial logistic regression, optimize a cross-entropy loss. Normally, the cross-entropy layer follows the softmax layer, which produces a probability distribution.
In TensorFlow, there are at…

Maxim
- 52,561
- 27
- 155
- 209
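As a rough sketch of the choice (TF 2.x names, random placeholder logits), the op is picked by the shape and meaning of the targets:
import tensorflow as tf
logits = tf.random.normal([4, 3])         # raw scores: no softmax/sigmoid applied beforehand
# exactly one class per example, integer targets:
a = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=tf.constant([0, 2, 1, 0]), logits=logits)
# exactly one class per example, one-hot or soft targets:
b = tf.nn.softmax_cross_entropy_with_logits(labels=tf.one_hot([0, 2, 1, 0], 3), logits=logits)
# multi-label: each class is an independent yes/no decision:
c = tf.nn.sigmoid_cross_entropy_with_logits(labels=tf.ones([4, 3]), logits=logits)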
61
votes
3 answers
In which cases is the cross-entropy preferred over the mean squared error?
Although both of these methods score a prediction better the closer it is to the target, cross-entropy is still preferred. Is that the case in every situation, or are there particular scenarios where we prefer cross-entropy over MSE?

Amogh Mishra
- 1,088
- 1
- 16
- 25
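One concrete way to see the usual argument (illustrative numbers): cross-entropy penalizes confident wrong predictions far more heavily than MSE, which also yields larger gradients early in training.
import numpy as np
y = 1.0                                   # true label
for p in (0.9, 0.5, 0.1):                 # predicted probability of the true class
    mse = (y - p) ** 2                    # saturates at 1
    ce = -np.log(p)                       # grows without bound as p -> 0
    print(f"p={p}: MSE={mse:.2f}  cross-entropy={ce:.2f}")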
57
votes
2 answers
What is the difference between a sigmoid followed by the cross entropy and sigmoid_cross_entropy_with_logits in TensorFlow?
When trying to compute the cross-entropy with a sigmoid activation function, there is a difference between
loss1 = -tf.reduce_sum(p*tf.log(q), 1)
loss2 = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(labels=p, logits=logit_q),1)
But they are the…

D.S.H.J
- 581
- 1
- 7
- 6
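A likely source of the difference: loss1 only contains the -p*log(q) term, while the TF op computes the full element-wise binary cross-entropy. A sketch in TF 2.x syntax with made-up values:
import tensorflow as tf
p = tf.constant([[1.0, 0.0, 1.0]])        # targets
logit_q = tf.constant([[2.0, -1.0, 0.5]]) # raw logits
q = tf.sigmoid(logit_q)
# full element-wise binary cross-entropy, matching the built-in op:
loss_manual = tf.reduce_sum(-p * tf.math.log(q) - (1 - p) * tf.math.log(1 - q), 1)
loss_builtin = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(labels=p, logits=logit_q), 1)
# loss_manual equals loss_builtin; dropping the (1 - p) * log(1 - q) term explains loss1 != loss2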
41
votes
1 answer
What is the difference between cross-entropy and log loss error?
What is the difference between cross-entropy and log loss error? The formulae for both seem to be very similar.

user3303020
- 933
- 2
- 12
- 26
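For binary classification with hard labels the two names refer to the same quantity; "log loss" is just the common name for binary cross-entropy. A minimal NumPy check (made-up predictions):
import numpy as np
y = np.array([1.0, 0.0, 1.0, 1.0])        # true labels
p = np.array([0.9, 0.2, 0.7, 0.4])        # predicted P(y = 1)
log_loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
print(log_loss)                            # identical to the mean binary cross-entropy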
40
votes
2 answers
What are the differences between all these cross-entropy losses in Keras and TensorFlow?
What are the differences between all these cross-entropy losses?
Keras is talking about
- Binary cross-entropy
- Categorical cross-entropy
- Sparse categorical cross-entropy
While TensorFlow has
- Softmax cross-entropy with logits
- Sparse softmax…

ScientiaEtVeritas
- 5,158
- 4
- 41
- 59
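A sketch of how the Keras names line up with the shape of the targets (TF 2.x, illustrative values; the TF ops listed above are their lower-level counterparts):
import tensorflow as tf
logits = tf.constant([[0.2, 1.5, -0.3]])
# one-hot targets      -> categorical cross-entropy
cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True)(tf.constant([[0.0, 1.0, 0.0]]), logits)
# integer targets      -> sparse categorical cross-entropy
scce = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)(tf.constant([1]), logits)
# binary / multi-label -> binary cross-entropy
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)(tf.constant([[1.0]]), tf.constant([[0.8]]))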
32
votes
1 answer
How does binary cross entropy loss work on autoencoders?
I wrote a vanilla autoencoder using only Dense layers.
Below is my code:
iLayer = Input((784,))
layer1 = Dense(128, activation='relu')(iLayer)
layer2 = Dense(64, activation='relu')(layer1)
layer3 = Dense(28, activation='relu')(layer2)
layer4 =…

Whoami
- 13,930
- 19
- 84
- 140
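The usual reasoning, sketched with made-up pixel values: with inputs scaled to [0, 1] and a sigmoid on the decoder's last layer, binary cross-entropy treats each pixel as an independent Bernoulli variable and is applied element-wise.
import numpy as np
x = np.array([0.0, 0.25, 1.0])            # input pixels scaled to [0, 1]
x_hat = np.array([0.1, 0.3, 0.9])         # sigmoid outputs of the decoder
bce = -np.mean(x * np.log(x_hat) + (1 - x) * np.log(1 - x_hat))  # per-pixel BCE, averaged
print(bce)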
30
votes
3 answers
python - invalid value encountered in log
I have the following expression:
log = np.sum(np.nan_to_num(-y*np.log(a+ 1e-7)-(1-y)*np.log(1-a+ 1e-7)))
It is giving me the following warning:
RuntimeWarning: invalid value encountered in log
log = np.sum(np.nan_to_num(-y*np.log(a+…

helix
- 1,017
- 3
- 12
- 30
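The "invalid value" warning fires when np.log sees a negative number or NaN (an argument of exactly 0 gives a "divide by zero" warning instead), so a is likely straying outside [0, 1] far enough that a + 1e-7 or 1 - a + 1e-7 turns negative. A common mitigation, sketched with made-up arrays, is to clip the predictions instead of adding a small constant:
import numpy as np
y = np.array([1.0, 0.0, 1.0])
a = np.array([0.99, 0.01, 1.0])            # predictions at (or beyond) 0.0 or 1.0 break the log
eps = 1e-7
a = np.clip(a, eps, 1 - eps)               # keep the log arguments strictly inside (0, 1)
log = np.sum(-y * np.log(a) - (1 - y) * np.log(1 - a))
print(log)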
25
votes
2 answers
How does TensorFlow SparseCategoricalCrossentropy work?
I'm trying to understand this loss function in TensorFlow but I don't get it. It's SparseCategoricalCrossentropy. All other loss functions need outputs and labels of the same shape, but this specific loss function doesn't.
Source code:
import tensorflow…

Dee
- 7,455
- 6
- 36
- 70
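A sketch of why the shapes differ (TF 2.x, made-up logits): each integer label simply indexes a row of log_softmax(outputs), so labels have shape (batch,) while outputs have shape (batch, num_classes).
import tensorflow as tf
logits = tf.constant([[2.0, 0.1, 0.3], [0.2, 1.8, 0.5]])   # shape (2, 3): one score per class
labels = tf.constant([0, 1])                               # shape (2,): integer class ids
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)(labels, logits)
print(loss)                                                # mean of -log_softmax(logits)[i, labels[i]]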
25
votes
2 answers
What is the problem with my implementation of the cross-entropy function?
I am learning about neural networks and I want to write a function cross_entropy in Python. It is defined as
where N is the number of samples, k is the number of classes, log is the natural logarithm, t_i,j is 1 if sample i is in class j and 0…

Jassy.W
- 539
- 2
- 9
- 16
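One straightforward NumPy version of that definition, as a sketch (the eps clipping and the example arrays are my additions):
import numpy as np
def cross_entropy(t, p, eps=1e-12):
    # t: (N, k) one-hot targets, p: (N, k) predicted probabilities
    p = np.clip(p, eps, 1.0)               # avoid log(0)
    return -np.sum(t * np.log(p)) / t.shape[0]
t = np.array([[1, 0, 0], [0, 1, 0]])
p = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
print(cross_entropy(t, p))                  # (-log 0.7 - log 0.8) / 2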
21
votes
1 answer
PyTorch LogSoftmax vs Softmax for CrossEntropyLoss
I understand that PyTorch's LogSoftmax function is basically just a more numerically stable way to compute Log(Softmax(x)). Softmax lets you convert the output from a Linear layer into a categorical probability distribution.
The pytorch…

JacKeown
- 2,780
- 7
- 26
- 34
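The key point, sketched with random logits: nn.CrossEntropyLoss expects raw logits and applies log_softmax itself, so it is equivalent to LogSoftmax followed by NLLLoss; feeding plain Softmax outputs into NLLLoss would be wrong.
import torch
import torch.nn as nn
logits = torch.randn(4, 3)                 # raw outputs of a Linear layer
target = torch.tensor([0, 2, 1, 0])
loss_a = nn.CrossEntropyLoss()(logits, target)               # takes raw logits directly
loss_b = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), target)  # same value as loss_a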
20
votes
4 answers
PyTorch equivalence for softmax_cross_entropy_with_logits
I was wondering: is there an equivalent PyTorch loss function for TensorFlow's softmax_cross_entropy_with_logits?

Dark_Voyager
- 323
- 1
- 2
- 8
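A sketch of the usual mapping (random placeholder tensors): for hard integer labels use torch.nn.functional.cross_entropy, and for one-hot or soft label distributions the per-example loss can be written out manually.
import torch
import torch.nn.functional as F
logits = torch.randn(4, 3)
soft_labels = F.softmax(torch.randn(4, 3), dim=1)    # arbitrary soft target distributions
# same semantics as TF's softmax_cross_entropy_with_logits (per-example losses):
loss = -(soft_labels * F.log_softmax(logits, dim=1)).sum(dim=1)
# with hard integer targets, F.cross_entropy(logits, torch.tensor([0, 2, 1, 0])) covers the same case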
15
votes
1 answer
About tf.nn.softmax_cross_entropy_with_logits_v2
I have noticed that tf.nn.softmax_cross_entropy_with_logits_v2(labels, logits) mainly performs 3 operations:
Apply softmax to the logits (y_hat) in order to normalize them: y_hat_softmax = softmax(y_hat).
Compute the cross-entropy loss: y_cross =…

lifang
- 1,485
- 3
- 16
- 23
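Those steps can be reproduced by hand; a sketch in TF 2.x (where the op is simply tf.nn.softmax_cross_entropy_with_logits, values made up):
import tensorflow as tf
y_hat = tf.constant([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])     # logits
y_true = tf.constant([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
y_hat_softmax = tf.nn.softmax(y_hat)                         # 1) normalize the logits
y_cross = -tf.reduce_sum(y_true * tf.math.log(y_hat_softmax), axis=1)  # 2) per-example cross-entropy
builtin = tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_hat)
# y_cross matches builtin, though the built-in op is more numerically stable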