
I am trying to code my own neural networks, but there is something I don't understand about bias terms. I know each link from one neuron to another has a weight, but does the link between a bias and the neuron it's connected to also have a weight? Or can I think of that weight as always being 1 and never changing?

Thanks

omega
    What is *always 1 and never changes* is the ***input value*** of the bias, not its weight. – Luis Jun 27 '16 at 17:43

1 Answer


The bias terms do have weights, and typically you add a bias to every neuron in the hidden layers as well as to the neurons in the output layer (prior to squashing).

Have a look at the basic structure of Artificial Neurons: you see the bias is added as `w_k0 = b_k`. For more thorough examples, see e.g. this link, containing formulas as well as visualisations of multi-layered NNs.
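To make the "bias is just another weight" point concrete, here is a minimal sketch (my own illustration, not code from the linked pages) of a single neuron in which the bias `b_k` is stored as the extra weight `w_k0` on a constant input of 1. During training `w_k0` is updated just like every other weight; only the input it multiplies stays fixed at 1:

```python
import numpy as np

def sigmoid(z):
    """Logistic squashing function."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, weights):
    """weights[0] is the bias weight w_k0 = b_k; it multiplies a constant 1."""
    x_with_bias = np.concatenate(([1.0], x))  # prepend the fixed bias input
    return sigmoid(np.dot(weights, x_with_bias))

x = np.array([0.5, -0.2])         # two ordinary inputs
w = np.array([0.1, 0.4, -0.3])    # w[0] is the (trainable) bias weight
print(neuron_output(x, w))
```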

For a discussion of the choice of weights, refer to the following stats.stackexchange thread: http://stats.stackexchange.com/questions/47590/what-are-good-initial-weights-in-a-neural-network

dfrib
  • Does a bias term get added to every non-input layer? – omega Dec 25 '15 at 01:04
  • I mean it gets added to each layer except the output layer. Then each one connects to all the neurons in the next layer to the right. – omega Dec 25 '15 at 01:05
  • Yes, typically you add bias to every hidden layer. – dfrib Dec 25 '15 at 01:06
  • What about a bias connected to each neuron in the output layer? – omega Dec 25 '15 at 01:07
  • In here http://stackoverflow.com/questions/7175099/why-bias-is-necessory-in-nn-and-should-we-have-seperate-bias-for-each-layer they say `Generally speaking, having bias weights going to every non-input unit is a good idea, since otherwise those units without bias weights would have thresholds that will always be zero`. – omega Dec 25 '15 at 01:09
  • Yes, generally you add bias at least to every neuron in the hidden layers, as well as to the neurons in the output layer (prior to squashing/using sigmoid). See the following link for some valuable visualisations: http://home2.fvcc.edu/~dhicketh/LinearAlgebra/studentprojects/fall2003/JohnAdam/neuralnets.htm – dfrib Dec 25 '15 at 01:11
  • Also, for the initial weights when you create the NN, can the weights range from -1 to 1, or do they have to be from 0 to 1? – omega Dec 25 '15 at 01:12
  • Usually you want initial weights spanning a range that is symmetric around zero: if your inputs are normalised to mean 0, then symmetric weights will yield an output with mean 0. See http://stats.stackexchange.com/questions/47590/what-are-good-initial-weights-in-a-neural-network for details. Also, regarding the choice of weight *range* (or constant `c` in the squashing function), see http://stackoverflow.com/questions/34120076/calculating-weights-in-a-nn/34122847#34122847 (a short initialisation sketch follows these comments). – dfrib Dec 25 '15 at 01:17
  • I saw that but I'm still not sure. If my activation function maps to [-1, 1] like a logistic or hyperbolic tangent, and the initial weights range over [-1, 1] and all input values range over [-1, 1], then would that be the best way to do it? Basically keep everything normalized and from -1 to 1? – omega Dec 25 '15 at 01:43
  • 1
    This is turning more into a discussion than questions about the answer. If you have a specific problem, it's probably better that your ask a new question where you describe this in detail, as well as your thought about how to set the weights. Anyway, last answer in this discussion: generally the activation function map into the range [0,1], and to make "full use" of this range, naturally negative weights should be allowed (given random uniform input). If your problem uses special kinds of squashing functions and so on, you'll have to adapt. Please see the 2nd thread I linked to above, ... – dfrib Dec 25 '15 at 01:55
  • 1
    ... regarding "there is no free lunch". (Machine learning: each problem needs its own fine-tuning and reasoning) – dfrib Dec 25 '15 at 01:55
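Following up on the initialisation discussion in the comments above, here is a small sketch (again my own, under the assumption of inputs normalised to mean 0) of drawing the initial weights, including the bias weights, uniformly from a span symmetric around zero rather than from [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def init_layer_weights(n_inputs, n_neurons, span=1.0):
    """Draw weights uniformly from [-span, span], symmetric around zero.

    The extra +1 column holds the bias weights, i.e. the weights on the
    constant-1 input discussed in the answer above.
    """
    return rng.uniform(-span, span, size=(n_neurons, n_inputs + 1))

# e.g. a hidden layer with 4 neurons fed by 3 inputs, and an output layer
# with 1 neuron fed by those 4 hidden activations
W_hidden = init_layer_weights(n_inputs=3, n_neurons=4)
W_output = init_layer_weights(n_inputs=4, n_neurons=1)
print(W_hidden.shape, W_output.shape)  # (4, 4) (1, 5)
```

Whether [-1, 1] (or some smaller symmetric span) is the best choice depends on the squashing function and the scale of the inputs, as the linked threads discuss.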