28

I'm creating a neural network using the backpropagation technique for learning.

I understand we need to find the derivative of the activation function used. I'm using the standard sigmoid function

f(x) = 1 / (1 + e^(-x))

and I've seen that its derivative is

dy/dx = f'(x) = f(x) * (1 - f(x))

This may be a daft question, but does this mean that we have to pass x through the sigmoid function twice when evaluating the derivative, so that it expands to

dy/dx = f'(x) = 1 / (1 + e^(-x)) * (1 - (1 / (1 + e^(-x))))

or is it simply a matter of taking the already calculated value of f(x), which is the output of the neuron, and substituting it for f(x)?

nbro
rflood89
  • I would suggest trying to take the derivative yourself. With a bit of algebra you can derive exactly f(x) * (1 - f(x)), and then you'll understand exactly what is going on. (And the answers below are 100% correct.) – Nathan S. May 18 '12 at 06:15
  • Think of your original problem in terms of substitution and you'll see that f(x) is a common term you can replace with a single variable. – Brian Jack Jul 13 '13 at 18:02

4 Answers

45

Dougal is correct. Just do

f = 1/(1+exp(-x))
df = f * (1 - f)
Bruno Kim
14

The two ways of doing it are equivalent (since mathematical functions don't have side effects and always return the same output for a given input), so you might as well do it the (faster) second way and reuse the value of f(x) you have already computed.
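
As a rough illustration of that equivalence, here is a minimal Python sketch. The function names `derivative_recomputed` and `derivative_from_output` are made up for this example and are not from the question; the sample inputs are arbitrary.

```python
import math

def sigmoid(x):
    """Standard logistic function f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def derivative_recomputed(x):
    # First way: evaluate the sigmoid again inside the derivative (two calls).
    return sigmoid(x) * (1.0 - sigmoid(x))

def derivative_from_output(y):
    # Second way: y is the already computed neuron output f(x).
    return y * (1.0 - y)

# Arbitrary sample inputs; both ways should agree on each of them.
for x in (-2.0, 0.0, 0.5, 3.0):
    y = sigmoid(x)  # computed once during the forward pass
    assert math.isclose(derivative_recomputed(x), derivative_from_output(y))
```

The second form needs no further calls to `exp` once the output is known, which is where the speed difference comes from.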

Danica
6

A little algebra simplifies this into a closed form in x, so that df doesn't have to call f at all.
df = exp(-x)/(1+exp(-x))^2

derivation:

df = 1/(1+e^-x) * (1 - (1/(1+e^-x)))
df = 1/(1+e^-x) * (1+e^-x - 1)/(1+e^-x)
df = 1/(1+e^-x) * (e^-x)/(1+e^-x)
df = (e^-x)/(1+e^-x)^2
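
A quick numerical check of the last line of the derivation, as a sketch in Python. The name `sigmoid_derivative_closed_form` is an assumption chosen for this example. Note that this version takes the raw input x and evaluates `exp` itself, unlike the form that reuses the neuron's output.

```python
import math

def sigmoid(x):
    """Standard logistic function f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative_closed_form(x):
    # df = e^(-x) / (1 + e^(-x))^2, the final line of the derivation.
    e = math.exp(-x)
    return e / (1.0 + e) ** 2

# Compare against f * (1 - f) at a few arbitrary points.
for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    f = sigmoid(x)
    assert math.isclose(sigmoid_derivative_closed_form(x), f * (1.0 - f))
```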
  • I would put the explanation in a comment and just have a single-line function returning the derived value (as you calculated in the last line). – shantanu pathak May 24 '19 at 05:59
4

You can use the output of your sigmoid function and pass it to your SigmoidDerivative function to be used as the f(x) in the following:

dy/dx = f'(x) = f(x) * (1 - f(x))
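
One way this could look inside a backpropagation step, sketched in Python for a single output neuron. The weights, bias, inputs, and target are made-up values, and the error term assumes a squared-error loss, which the question does not specify. `sigmoid_derivative` stands in for the SigmoidDerivative function mentioned above.

```python
import math

def sigmoid(x):
    """Standard logistic function f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(output):
    # Expects the neuron's output f(x), not the raw input x.
    return output * (1.0 - output)

# Hypothetical single neuron; all numbers here are illustrative only.
weights = [0.4, -0.6]
bias = 0.1
inputs = [1.0, 0.5]
target = 1.0

# Forward pass: the sigmoid is evaluated exactly once.
net = sum(w * i for w, i in zip(weights, inputs)) + bias
output = sigmoid(net)

# Backward pass: reuse the stored output (assumes squared-error loss).
delta = (target - output) * sigmoid_derivative(output)
```

Storing `output` during the forward pass is what makes the backward pass cheap: the derivative costs one subtraction and one multiplication per neuron.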
Murphy
jarwal