I have been trying to understand the softmax function, and I came up with the simple example below.
import numpy as np

def simpleSoftmax(allValues):
    return np.exp(allValues) / np.sum(np.exp(allValues), axis=0)
Invoking it:
simpleSoftmax([3,2,4])
array([ 0.24472847, 0.09003057, 0.66524096])
In this case 0.66 is the highest probability. Understood.
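To convince myself where 0.66524096 comes from, I also checked one entry by hand with plain math.exp (this is just my own check, not part of the function above):

import math

math.exp(4) / (math.exp(3) + math.exp(2) + math.exp(4))   # ≈ 0.66524096, matches the third softmax value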
Now, the same thing can seemingly be done with plain proportions, since the inputs sum to 9:
(3/9)*100 = 33.33
(2/9)*100 = 22.22
(4/9)*100 = 44.44
Again 44.44 is the largest value, so the ranking comes out the same as with the softmax.
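For comparison, here is a minimal sketch of that proportion idea written as a function (simpleProportion is just a name I made up for this example):

import numpy as np

def simpleProportion(allValues):
    # divide each value by the sum of all values
    allValues = np.asarray(allValues, dtype=float)
    return allValues / np.sum(allValues)

simpleProportion([3, 2, 4])
array([ 0.33333333,  0.22222222,  0.44444444])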
I am sure there is something interesting about softmax compared to this plain proportional averaging. However, I don't understand what actually makes the difference between these two approaches.