I implemented the softmax() function, softmax_crossentropy(), and the analytic derivative of softmax cross entropy, grad_softmax_crossentropy(). Now I want to verify that gradient numerically. I tried to do this with the finite difference method, but the approximation returns only zeros. Here is my code with some random data:
import numpy as np
batch_size = 3
classes = 10
# random preactivations
a = np.random.randint(1,100,(batch_size,classes))
# random labels
y = np.random.randint(0,np.size(a,axis=1),(batch_size,1))
def softmax(a):
    # shift by the row-wise max for numerical stability, then normalize
    epowa = np.exp(a - np.max(a, axis=1, keepdims=True))
    return epowa / np.sum(epowa, axis=1, keepdims=True)
print(softmax(a))
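As a quick sanity check (just a small extra check using the arrays above), every row of the softmax output should sum to 1:

# each row of softmax(a) should sum to 1
print(np.allclose(np.sum(softmax(a), axis=1), 1.0))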
def softmax_crossentropy(a, y):
    # one-hot encode the labels, then take the negative log-likelihood of the true class
    y_one_hot = np.eye(classes)[y[:,0]]
    return -np.sum(y_one_hot * np.log(softmax(a)), axis=1)
print(softmax_crossentropy(a, y))
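The per-sample loss should equal minus the log of the softmax probability assigned to the true class, which can be cross-checked directly (a minimal check with the same arrays):

# cross-check: -log of the probability of each sample's true class
probs = softmax(a)
print(-np.log(probs[np.arange(batch_size), y[:,0]]))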
def grad_softmax_crossentropy(a, y):
    # analytic gradient of the loss w.r.t. the preactivations: softmax(a) - y_one_hot
    y_one_hot = np.eye(classes)[y[:,0]]
    return softmax(a) - y_one_hot
print(grad_softmax_crossentropy(a, y))
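Since both the softmax rows and the one-hot rows sum to 1, every row of this analytic gradient should sum to 0 (another small check):

# each gradient row should sum to 1 - 1 = 0
print(np.allclose(np.sum(grad_softmax_crossentropy(a, y), axis=1), 0.0))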
# Finite difference approach to compute grad_softmax_crossentropy()
eps = 1e-5
print((softmax_crossentropy(a+eps,y)-softmax_crossentropy(a,y))/eps)
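For comparison, a per-element finite difference would perturb one preactivation at a time rather than all of them at once; a rough sketch of that version (the loop and variable names are just illustrative):

# per-element finite difference: perturb a single entry of a at a time
num_grad = np.zeros((batch_size, classes))
base_loss = softmax_crossentropy(a.astype(float), y)
for i in range(batch_size):
    for j in range(classes):
        a_plus = a.astype(float)
        a_plus[i, j] += eps
        num_grad[i, j] = (softmax_crossentropy(a_plus, y)[i] - base_loss[i]) / eps
print(num_grad)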
What am I doing wrong?