
I am implementing a neural network in Python to recognize handwritten digits. This is the cost function (binary cross-entropy):

J = -(1/m) * Σᵢ [ yᵢ · log(h(xᵢ)) + (1 − yᵢ) · log(1 − h(xᵢ)) ]

In the log(1 − h(x)) term, if h(x) is 1, it results in log(1 − 1), i.e., log(0), so I'm getting a math domain error.
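A minimal example that reproduces the error:

```python
import math

h = 1.0
math.log(1 - h)  # raises ValueError: math domain error
```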

I'm initializing the weights randomly in the range [10, 60]. I'm not sure what I have to change, or where!


1 Answer


In this formula, h(x) is usually a sigmoid: h(x) = sigmoid(x), so mathematically it is never exactly 1.0. In floating point, however, it rounds to 1.0 when the activations in the network are too large (which is bad and will cause problems anyway). The same problem occurs with log(h(x)) when h(x) = 0, i.e., when x is a large negative number.

If you don't want to worry about numerical issues, simply add a small number before computing the log: log(h(x) + 1e-10).
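For instance, a minimal NumPy sketch of this fix (the function names here are just for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y, h, eps=1e-10):
    # eps keeps both log arguments strictly positive.
    return -np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))

z = np.array([-50.0, 0.0, 50.0])  # large |z| saturates the sigmoid
h = sigmoid(z)                    # h[2] rounds to exactly 1.0 in float64
y = np.array([0.0, 1.0, 0.0])
print(binary_cross_entropy(y, h))  # finite; no math domain error
```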

Other issues:

  • Weight initialization in the range [10, 60] doesn't look right; the weights should be small random numbers, e.g., drawn from [-0.01, 0.01] (see the sketch after this list).
  • The formula above computes the binary cross-entropy loss. If you're working with MNIST, it has 10 classes, so the loss must be multi-class cross-entropy. See this question for details.
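A rough sketch of both fixes, assuming a plain NumPy implementation (the hidden-layer size of 64 is made up for the example; 7990 is the input size from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Small random weights, e.g., uniform in [-0.01, 0.01], instead of [10, 60].
W1 = rng.uniform(-0.01, 0.01, size=(7990, 64))  # input -> hidden (64 is arbitrary)
W2 = rng.uniform(-0.01, 0.01, size=(64, 10))    # hidden -> 10 digit classes

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def multiclass_cross_entropy(y_onehot, probs, eps=1e-10):
    # Mean negative log-probability of the true class; eps guards log(0).
    return -np.mean(np.sum(y_onehot * np.log(probs + eps), axis=1))
```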
  • Thank you, it worked! My input consists of only 0 and 255, since they are black-and-white images, and I have now changed all 255s to 0. Can you suggest how many hidden nodes and layers I should use for my network? My input layer has 7990 nodes (85x94) and my output layer has 10 nodes. – Gokulakannan Jan 24 '18 at 17:06
  • First, I'd suggest you rescale your input to `[0, 1]`. As for the layers and their sizes, it really depends; you can start with 2 layers and cross-validate whether more help. – Maxim Jan 24 '18 at 17:08
  • Sorry for the typo! I changed all 255s to one, not zero. Thank you for the help. – Gokulakannan Jan 25 '18 at 02:59