I have been trying to create a small neural network to learn softmax function with an article from the following website: https://mlxai.github.io/2017/01/09/implementing-softmax-classifier-with-vectorized-operations.html
It works well for a single iteration. But, when I create a loop for training the network with updated weights, I get the following error: ValueError: operands could not be broadcast together with shapes (5,10) (1,5) (5,10). I have attached a screenshot of the output here.
Debugging this issue, I found out that np.max() returns array of shape (5,1) and (1,5) at different iterations even though the axis is being set to 1. Please help me in identifying what went wrong in the following code.
import numpy as np
N = 5
D = 10
C = 10
W = np.random.rand(D,C)
X = np.random.randint(255, size = (N,D))
X = X/255
y = np.random.randint(C, size = (N))
#print (y)
lr = 0.1
for i in range(100):
print (i)
loss = 0.0
dW = np.zeros_like(W)
N = X.shape[0]
C = W.shape[1]
f = X.dot(W)
#print (f)
print (np.matrix(np.max(f, axis=1)))
print (np.matrix(np.max(f, axis=1)).T)
f -= np.matrix(np.max(f, axis=1)).T
#print (f)
term1 = -f[np.arange(N), y]
sum_j = np.sum(np.exp(f), axis=1)
term2 = np.log(sum_j)
loss = term1 + term2
loss /= N
loss += 0.5 * reg * np.sum(W * W)
#print (loss)
coef = np.exp(f) / np.matrix(sum_j).T
coef[np.arange(N),y] -= 1
dW = X.T.dot(coef)
dW /= N
dW += reg*W
W = W - lr*dW