I was reading up on log-loss and cross-entropy, and it seems there are two approaches to calculating it, based on the following equations.
The first one is the following.
    import numpy as np
    from sklearn.metrics import log_loss

    def cross_entropy(predictions, targets):
        # Mean over the N samples of -sum_k targets[k] * log(predictions[k])
        N = predictions.shape[0]
        ce = -np.sum(targets * np.log(predictions)) / N
        return ce

    predictions = np.array([[0.25, 0.25, 0.25, 0.25],
                            [0.01, 0.01, 0.01, 0.97]])
    targets = np.array([[1, 0, 0, 0],
                        [0, 0, 0, 1]])
    x = cross_entropy(predictions, targets)
    # Compare scikit-learn's value with the hand-rolled one
    print(log_loss(targets, predictions), 'our_answer:', x)
The output of the previous program is 0.7083767843022996 our_answer: 0.71355817782, which is almost the same. So that's not the issue.
The above implementation is the middle part of the equation above.
The second approach is based on the RHS part of the equation above.
    res = 0
    for act_row, pred_row in zip(targets, np.array(predictions)):
        for class_act, class_pred in zip(act_row, pred_row):
            # Binary-style term applied to every class entry
            res += - class_act * np.log(class_pred) - (1-class_act) * np.log(1-class_pred)
    print(res/len(targets))
And the output is 1.1549753967602232, which is not quite the same.
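For reference, the double loop above can also be written without explicit loops. This is only a minimal sketch of the same calculation, not a fix: the function name `elementwise_binary_loss` is made up for illustration, and the arrays are repeated so the snippet runs on its own.

```python
import numpy as np

# Same example data as above, repeated so the snippet is self-contained
predictions = np.array([[0.25, 0.25, 0.25, 0.25],
                        [0.01, 0.01, 0.01, 0.97]])
targets = np.array([[1, 0, 0, 0],
                    [0, 0, 0, 1]])

def elementwise_binary_loss(predictions, targets):
    # Hypothetical helper: applies -y*log(p) - (1-y)*log(1-p) to every
    # entry, sums everything, and divides by the number of rows,
    # exactly like the nested loop does.
    terms = -targets * np.log(predictions) - (1 - targets) * np.log(1 - predictions)
    return terms.sum() / len(targets)

vectorized = elementwise_binary_loss(predictions, targets)
print(vectorized)
```

It gives the same value as the loop, so the discrepancy with `log_loss` comes from the formula being applied, not from the looping itself.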
I have tried the same implementation with NumPy, but it also didn't work. What am I doing wrong?
PS: I am also curious: it seems to me that -y log(y_hat) is the same as -sigma(p_i * log(q_i)), so how come there is a -(1-y) log(1-y_hat) part? Clearly I am misunderstanding how -y log(y_hat) is to be calculated.
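One way to probe this is the two-class case. The sketch below uses made-up numbers and assumes that `y_hat` is the predicted probability of class 1, so that class 0 gets `1 - y_hat`. It compares the binary formula with the sum over both classes; all names are illustrative only.

```python
import numpy as np

# Made-up binary labels and predicted probabilities of class 1
y = np.array([1, 0, 1])
y_hat = np.array([0.9, 0.2, 0.6])

# Binary form: -y*log(y_hat) - (1-y)*log(1-y_hat), averaged over samples
binary_form = np.mean(-y * np.log(y_hat) - (1 - y) * np.log(1 - y_hat))

# The same data written as two one-hot columns: [class 0, class 1]
p = np.column_stack([1 - y, y])          # true distribution per sample
q = np.column_stack([1 - y_hat, y_hat])  # predicted distribution per sample

# Sum form: -sum_i p_i*log(q_i) per sample, averaged over samples
sum_form = np.mean(-np.sum(p * np.log(q), axis=1))

print(binary_form, sum_form)
```

Under that assumption the two numbers agree, which suggests the -(1-y) log(1-y_hat) term is just the second class's entry of the sum when there are only two classes.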