
I am implementing a neural network in Python to recognize handwritten digits. This is the cost function (binary cross-entropy):

J = -(1/m) * Σᵢ [ yᵢ · log(h(xᵢ)) + (1 − yᵢ) · log(1 − h(xᵢ)) ]

In the log(1 − h(x)) term, if h(x) is 1, it results in log(1 − 1), i.e., log(0), so I'm getting a math domain error.
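A minimal example that reproduces the error:

```python
import math

h = 1.0
math.log(1 - h)  # raises ValueError: math domain error
```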

I'm initializing the weights randomly in the range [10, 60]. I'm not sure what I have to change, or where!


1 Answer


In this formula, h(x) is usually a sigmoid: h(x) = sigmoid(x), so mathematically it is never exactly 1.0. In floating point, however, it rounds to 1.0 when the activations in the network are too large (which is bad and will cause problems anyway). The same problem occurs with log(h(x)) when h(x) = 0, i.e., when x is a large negative number.

If you don't want to worry about numerical issues, simply add a small number before computing the log: log(h(x) + 1e-10).
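For instance, a minimal NumPy sketch of this fix (the function names here are just for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y, h, eps=1e-10):
    # eps keeps both log arguments strictly positive.
    return -np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))

z = np.array([-50.0, 0.0, 50.0])  # large |z| saturates the sigmoid
h = sigmoid(z)                    # h[2] rounds to exactly 1.0 in float64
y = np.array([0.0, 1.0, 0.0])
print(binary_cross_entropy(y, h))  # finite; no math domain error
```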

Other issues:

  • Weight initialization in the range [10, 60] doesn't look right; the weights should be small random numbers, e.g., drawn from [-0.01, 0.01] (see the sketch after this list).
  • The formula above computes the binary cross-entropy loss. If you're working with MNIST, it has 10 classes, so the loss must be multi-class cross-entropy. See this question for details.
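A rough sketch of both fixes, assuming a plain NumPy implementation (the hidden-layer size of 64 is made up for the example; 7990 is the input size from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Small random weights, e.g., uniform in [-0.01, 0.01], instead of [10, 60].
W1 = rng.uniform(-0.01, 0.01, size=(7990, 64))  # input -> hidden (64 is arbitrary)
W2 = rng.uniform(-0.01, 0.01, size=(64, 10))    # hidden -> 10 digit classes

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def multiclass_cross_entropy(y_onehot, probs, eps=1e-10):
    # Mean negative log-probability of the true class; eps guards log(0).
    return -np.mean(np.sum(y_onehot * np.log(probs + eps), axis=1))
```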
  • Thank you, it worked! My input consists of only 0 and 255, since they are black-and-white images, and I have now changed all 255s to 0. Can you suggest how many hidden nodes and layers I should use for my network? My input layer has 7990 nodes (85x94) and my output layer has 10 nodes. – Gokulakannan Jan 24 '18 at 17:06
  • First, I'd suggest you rescale your input to `[0, 1]`. As for the layers and their sizes, it really depends; you can start with 2 layers and cross-validate whether more help. – Maxim Jan 24 '18 at 17:08
  • Sorry for the typo! I changed all 255s to one, not zero. Thank you for the help. – Gokulakannan Jan 25 '18 at 02:59