An activation function is a non-linear transformation, usually applied in neural networks to the output of a linear or convolutional layer. Common activation functions include sigmoid, tanh, and ReLU.
Questions tagged [activation-function]
343 questions
102
votes
2 answers
What is the intuition of using tanh in LSTM?
In an LSTM network (Understanding LSTMs), why do the input gate and output gate use tanh?
What is the intuition behind this?
Is it just a nonlinear transformation? If so, can I change both to another activation function (e.g., ReLU)?

DNK
- 1,448
- 2
- 13
- 12
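For reference, here is the standard LSTM formulation matching the "Understanding LSTMs" post the question links to: the gates themselves use the sigmoid σ, while tanh appears for the candidate cell state and for squashing the cell state before the output gate is applied:

\begin{aligned}
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i), \quad
f_t = \sigma(W_f [h_{t-1}, x_t] + b_f), \quad
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o) \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \qquad
h_t = o_t \odot \tanh(c_t)
\end{aligned}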
58
votes
2 answers
How to make a custom activation function with only Python in Tensorflow?
Suppose you need to make an activation function that is not possible using only pre-defined TensorFlow building blocks; what can you do?
So in TensorFlow it is possible to make your own activation function. But it is quite complicated; you have to…

patapouf_ai
- 17,605
- 13
- 92
- 132
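A minimal sketch of one common route, assuming TensorFlow 2.x: build the activation from existing ops and, if the automatic gradient is not what you want, attach your own with tf.custom_gradient. The square_clip function below is a made-up toy activation for illustration, not anything from the linked question.

import tensorflow as tf

@tf.custom_gradient
def square_clip(x):
    # toy activation: x^2 clipped to [0, 1]
    y = tf.clip_by_value(tf.square(x), 0.0, 1.0)
    def grad(dy):
        # gradient of x^2 where the clip is inactive, zero where it saturates
        active = tf.cast(tf.square(x) < 1.0, x.dtype)
        return dy * 2.0 * x * active
    return y, grad

# usage: wrap it in a Lambda layer like any other element-wise activation
layer = tf.keras.layers.Lambda(square_clip)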
29
votes
3 answers
Activation function for output layer for regression models in Neural Networks
I have been experimenting with neural networks these days. I have come across a general question regarding the activation function to use. This might be a well-known fact, but I couldn't understand it properly. A lot of the examples and papers I have…

user7400738
- 443
- 2
- 6
- 18
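A minimal Keras sketch under common assumptions (layer sizes are illustrative): for a regression target that can take any real value, the output layer is usually left linear, with the non-linearities confined to the hidden layers.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),                   # 10 input features, illustrative
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),                      # no activation: linear output
])
model.compile(optimizer="adam", loss="mse")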
21
votes
5 answers
Why use softmax only in the output layer and not in hidden layers?
Most examples of neural networks for classification tasks I've seen use a softmax layer as the output activation function. Normally, the other hidden units use a sigmoid, tanh, or ReLU function as the activation function. Using the softmax function here…

beyeran
- 885
- 1
- 8
- 26
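For reference, softmax maps a vector of raw scores z to a probability distribution over classes, which is why it is normally reserved for the output layer:

\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}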
16
votes
1 answer
Does pytorch apply softmax automatically in nn.Linear
In PyTorch, a classification network model is defined like this:
class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden) #…

yujuezhao
- 1,015
- 3
- 11
- 21
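A minimal sketch of the usual convention (tensor shapes are illustrative): nn.Linear does not apply softmax; the model returns raw logits, and nn.CrossEntropyLoss applies log-softmax internally, so softmax is only called explicitly when probabilities are needed.

import torch

logits = torch.randn(4, 3)                           # batch of 4 samples, 3 classes
targets = torch.tensor([0, 2, 1, 0])
loss = torch.nn.CrossEntropyLoss()(logits, targets)  # log-softmax applied inside the loss
probs = torch.softmax(logits, dim=1)                 # explicit softmax, if probabilities are needed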
13
votes
3 answers
Why does the gated activation function (used in Wavenet) work better than a ReLU?
I have recently been reading the WaveNet and PixelCNN papers, and in both of them they mention that using gated activation functions works better than a ReLU. But in neither case do they offer an explanation as to why that is.
I have asked on other…

Ahmad Moussa
- 876
- 10
- 31
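For reference, the gated activation unit from the WaveNet paper combines a tanh "filter" branch with a sigmoid "gate" branch (∗ denotes convolution, ⊙ element-wise multiplication):

\mathbf{z} = \tanh(W_{f,k} * \mathbf{x}) \odot \sigma(W_{g,k} * \mathbf{x})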
13
votes
2 answers
What is the difference between a layer with a linear activation and a layer without activation?
I'm playing with Keras a little bit, and I'm wondering: what is the difference between a linear activation layer and no activation layer at all? Don't they have the same behavior? If so, what's the point of a linear activation then?
I mean the…

T.Poe
- 1,949
- 6
- 28
- 59
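A minimal Keras sketch, assuming the current Dense defaults: activation="linear" and omitting the activation argument both leave the layer's output as W·x + b, so the two layers below compute the same function.

import tensorflow as tf

a = tf.keras.layers.Dense(8, activation="linear")
b = tf.keras.layers.Dense(8)        # activation=None, i.e. the identity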
13
votes
3 answers
Pytorch custom activation functions?
I'm having issues implementing custom activation functions in PyTorch, such as Swish. How should I go about implementing and using custom activation functions in PyTorch?

ZeroMaxinumXZ
- 357
- 2
- 6
- 21
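A minimal sketch, assuming the simple Swish form x·sigmoid(x): because it is built from differentiable torch ops, autograd provides the backward pass, and wrapping it in nn.Module lets it drop into nn.Sequential like a built-in activation.

import torch
import torch.nn as nn

class Swish(nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(x)

net = nn.Sequential(nn.Linear(16, 32), Swish(), nn.Linear(32, 1))  # sizes illustrative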
12
votes
2 answers
Tensorflow error: Using a `tf.Tensor` as a Python `bool` is not allowed
I am struggling to implement an activation function in tensorflow in Python.
The code is the following:
def myfunc(x):
    if (x > 0):
        return 1
    return 0
But I am always getting the error:
Using a tf.Tensor as a Python bool is not…

Lilo
- 640
- 1
- 9
- 22
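A minimal sketch of one way around the error (TensorFlow assumed): a Python if cannot branch on a symbolic tensor, but the same step function can be expressed with element-wise tensor ops.

import tensorflow as tf

def myfunc(x):
    # 1.0 where x > 0, else 0.0, computed element-wise on the tensor
    return tf.cast(x > 0, tf.float32)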
12
votes
1 answer
List of activation functions in C#
I can find a list of activation functions in math but not in code.
So I guess this would be the right place for such a list in code, if there ever should be one,
starting with the translation of the algorithms in these 2…

Rottjung
- 493
- 1
- 5
- 15
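For reference, the underlying definitions being translated are short; a few of the most common ones:

\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \qquad
\mathrm{ReLU}(x) = \max(0, x)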
11
votes
1 answer
Why is ReLU a non-linear activation function?
As I understand it, in a deep neural network, we use an activation function (g) after applying the weights (w) and bias (b) (z := w * X + b | a := g(z)). So there is a composition function (g ∘ z), and the activation function makes it so our model…

FlyingZipper
- 701
- 6
- 26
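A one-line check that ReLU is not linear: a linear map must satisfy f(a + b) = f(a) + f(b) for all a, b, but

\mathrm{ReLU}(1) + \mathrm{ReLU}(-1) = 1 + 0 = 1 \neq 0 = \mathrm{ReLU}(1 + (-1))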
9
votes
2 answers
How to implement the derivative of Leaky Relu in python?
How would I implement the derivative of Leaky ReLU in Python without using TensorFlow?
Is there a better way than this? I want the function to return a numpy array.
def dlrelu(x, alpha=.01):
    # return alpha if x < 0 else 1
    return np.array…

Lécio Bourbon
- 163
- 1
- 2
- 8
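A minimal sketch of one common way to finish that function, assuming the usual convention of returning alpha on the non-positive side: np.where selects the derivative element-wise, so the result is already a NumPy array of the same shape as x.

import numpy as np

def dlrelu(x, alpha=0.01):
    # derivative of leaky ReLU: 1 where x > 0, alpha elsewhere
    return np.where(x > 0, 1.0, alpha)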
9
votes
2 answers
How to use different activation functions in one Keras layer?
I am working on Keras in Python and I have a neural network (see code below).
Currently it works with only a ReLU activation.
For experimental reasons I would like to have some neurons on ReLU and some on softmax (or any other activation function).…

Nicolas
- 392
- 5
- 14
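A minimal sketch with the Keras functional API (layer sizes illustrative): one way to approximate "some neurons on ReLU, some on softmax" is to split the layer into two parallel Dense branches with different activations and concatenate them.

import tensorflow as tf

inputs = tf.keras.Input(shape=(20,))
relu_part = tf.keras.layers.Dense(16, activation="relu")(inputs)
soft_part = tf.keras.layers.Dense(4, activation="softmax")(inputs)
merged = tf.keras.layers.Concatenate()([relu_part, soft_part])
outputs = tf.keras.layers.Dense(1)(merged)
model = tf.keras.Model(inputs, outputs)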
8
votes
1 answer
correct order for SpatialDropout2D, BatchNormalization and activation function?
For a CNN architecture I want to use a SpatialDropout2D layer instead of a Dropout layer.
Additionally, I want to use BatchNormalization.
So far I have always set BatchNormalization directly after a convolutional layer but before the activation…

Code Now
- 711
- 2
- 9
- 20
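A minimal Keras sketch of one commonly used ordering (filter sizes illustrative, and this is a convention rather than a settled rule): convolution, then BatchNormalization, then the activation, with SpatialDropout2D applied after the activation.

import tensorflow as tf

x = tf.keras.Input(shape=(32, 32, 3))
h = tf.keras.layers.Conv2D(64, 3, padding="same")(x)
h = tf.keras.layers.BatchNormalization()(h)
h = tf.keras.layers.Activation("relu")(h)
h = tf.keras.layers.SpatialDropout2D(0.2)(h)
model = tf.keras.Model(x, h)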
8
votes
5 answers
How do I implement leaky relu using Numpy functions
I am trying to implement leaky ReLU; the problem is that I have to do 4 for loops for a 4-dimensional input array.
Is there a way that I can do leaky ReLU using only NumPy functions?

Liu Hantao
- 620
- 1
- 9
- 19
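A minimal sketch (shapes illustrative): np.where broadcasts over arrays of any rank, so a 4-D input needs no explicit loops.

import numpy as np

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

out = leaky_relu(np.random.randn(2, 3, 4, 5))   # works directly on a 4-D array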