What does an activation function do in Neural networks - for a beginner

Question

I understand the concept of having multiple layers, backpropagation, etc. I even understand that an activation function squashes the output into a certain range, depending on which activation function is used. But why do we need this at all? What happens if we just carry on with the raw result, without an activation function?

Please help me understand, but in plain English - no graphs/formulas please - I want to understand the concept behind it.

Possible duplicate of [Why must a nonlinear activation function be used in a backpropagation neural network?](https://stackoverflow.com/questions/9782071/why-must-a-nonlinear-activation-function-be-used-in-a-backpropagation-neural-net) — Dr. Snoopy, Jan 26 '18 at 08:12

score 0 · Answer 1 · answered Jan 26 '18 at 06:21

There are a few reasons to use an activation function. The most common one is that the output must, by its nature, fall within a certain range, e.g. if the output is a probability, which is only valid in the range [0, 1].
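
As a rough illustration of that range argument, here is a minimal sketch using the sigmoid function, one common choice for probability-like outputs. The raw scores are made-up values, not taken from any real network.

```python
import math

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1).
    # Fine for the modest values used here; very large negative z
    # would overflow math.exp in this naive form.
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical raw outputs of a last layer (weighted input plus bias).
raw_scores = [-5.0, 0.0, 5.0]

# After the activation, every value can be read as a probability.
probabilities = [sigmoid(z) for z in raw_scores]

for z, p in zip(raw_scores, probabilities):
    print(f"raw score {z:+.1f} -> probability {p:.4f}")
```

Without the activation, a raw score such as -5.0 could not be interpreted as a probability at all.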

Fermat's Little Student

score 0 · Accepted Answer · answered Jan 26 '18 at 07:00

If your activation function is just a(z)=z (a linear neuron), the activation is just the weighted input (plus bias). In this case, the activation of each layer is a linear function of the previous layer's activation. You can quite easily convince yourself that the combined effect of many layers (i.e. a deep network) is still a linear function. That means you could get exactly the same result with just an input layer and an output layer, without any hidden neurons. In other words, adding hidden layers would not gain you any additional complexity in what your network can do, so there would be no advantage in going to "deep" neural networks.
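
If you want to convince yourself with numbers rather than formulas, here is a minimal sketch in plain Python. The weights and biases are arbitrary made-up values, and the helper names (`linear_layer`, `deep_linear`, and so on) are just for illustration. It builds a two-layer network with no activation, collapses it into a single layer, and then shows that putting a ReLU between the layers breaks that equivalence.

```python
def linear_layer(weights, bias, inputs):
    # One layer without activation: weighted input plus bias.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def relu(values):
    # A simple nonlinear activation: negative values become zero.
    return [max(0.0, v) for v in values]

# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output.
W1 = [[1.0, -2.0], [0.5, 1.0]]
b1 = [0.0, -1.0]
W2 = [[2.0, 1.0]]
b2 = [0.5]

def deep_linear(x):
    # Two layers stacked, with a(z) = z in between.
    return linear_layer(W2, b2, linear_layer(W1, b1, x))

# Collapse both layers into one: W = W2 * W1, b = W2 * b1 + b2.
W = [[sum(W2[i][k] * W1[k][j] for k in range(len(W1)))
      for j in range(len(W1[0]))] for i in range(len(W2))]
b = [sum(W2[i][k] * b1[k] for k in range(len(b1))) + b2[i]
     for i in range(len(W2))]

def single_linear(x):
    # A network with no hidden layer at all.
    return linear_layer(W, b, x)

def deep_relu(x):
    # The same two layers, but with a ReLU between them.
    return linear_layer(W2, b2, relu(linear_layer(W1, b1, x)))

x = [1.0, 1.0]
print(deep_linear(x), single_linear(x))  # these two agree
print(deep_relu(x))                      # this one differs
```

For this input the first hidden neuron goes negative, the ReLU zeroes it out, and no single linear layer reproduces the result for all inputs. That is the extra expressive power the nonlinearity buys.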

rain city

Thank you! Makes sense. So can I come up with my own activation function instead of going with one of the standard ones (ReLU, tanh, etc.), given that my activation function is zero mean and cheap to compute? I'm just asking to clarify the same question further. – Ravi Jan 28 '18 at 00:20
If you have a good idea, you could try something else. It doesn't even necessarily have to be antisymmetric around zero (I assume that's what you mean by zero mean) - rectified linear units aren't, either. – rain city Jan 28 '18 at 01:07