I was trying to build a custom activation function with tflearn by making the following changes:
Add my custom activation function to activations.py:
def my_activation(x):
    return tf.where(x >= 0.0, tf.div(x ** 2, x + tf.constant(0.6)), 0.01 * x)
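To sanity-check the forward values in isolation, I evaluated the function on a few points (a minimal TF1-style sketch I wrote for this post; the session scaffolding is mine, not part of tflearn):

import tensorflow as tf

# same definition as above
def my_activation(x):
    return tf.where(x >= 0.0, tf.div(x ** 2, x + tf.constant(0.6)), 0.01 * x)

with tf.Session() as sess:
    # forward pass looks sane: roughly [-0.01, 0.0, 0.625, 1.538]
    print(sess.run(my_activation(tf.constant([-1.0, 0.0, 1.0, 2.0]))))

The forward outputs looked reasonable to me, so I assumed the function itself was fine.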
and register it in tflearn's __init__.py:
from .activations import linear, tanh, sigmoid, softmax, softplus, softsign,\
    relu, relu6, leaky_relu, prelu, elu, crelu, selu, my_activation
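To confirm the registration worked, I checked that the name is importable from the top-level package (a quick check of my own):

import tflearn
from tflearn.activations import my_activation
print(tflearn.my_activation)  # prints the function object if __init__.py was patched

Both imports succeed, so I believe the registration itself is fine.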
Since TensorFlow can perform the gradient calculation automatically, I don't need to implement the gradient function myself. As pointed out in the article Deep Learning Programming Style:
In the past, whenever someone defined a new model, they had to work out the derivative calculations by hand. While the math is reasonably straightforward, for complex models, it can be time-consuming and tedious work. All modern deep learning libraries make the practitioner/researcher’s job much easier, by automatically solving the problem of gradient calculation.
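For example, I can get the derivative without deriving it by hand (a minimal TF1-style sketch; the placeholder and session are my own scaffolding, and it assumes the patched tflearn from above):

import tensorflow as tf
from tflearn.activations import my_activation

x = tf.placeholder(tf.float32, shape=[None])
y = my_activation(x)
dy_dx = tf.gradients(y, x)[0]  # autodiff builds the gradient graph for me

with tf.Session() as sess:
    print(sess.run(dy_dx, feed_dict={x: [-1.0, 0.0, 1.0]}))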
I trained the model on the CIFAR-10 dataset using this code: https://github.com/tflearn/tflearn/blob/master/examples/images/convnet_cifar10.py, but changed all relu activations to my_activation.
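Concretely, the edit just swaps the activation name in the layer definitions, something like this (paraphrased from the linked script, not copied verbatim; the string 'my_activation' should resolve by name thanks to the __init__.py change above):

from tflearn.layers.core import input_data, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d

network = input_data(shape=[None, 32, 32, 3])
network = conv_2d(network, 32, 3, activation='my_activation')        # was 'relu'
network = max_pool_2d(network, 2)
network = conv_2d(network, 64, 3, activation='my_activation')        # was 'relu'
network = conv_2d(network, 64, 3, activation='my_activation')        # was 'relu'
network = max_pool_2d(network, 2)
network = fully_connected(network, 512, activation='my_activation')  # was 'relu'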
Sadly, this simple modification caused the network to fail to learn anything:
Training Step: 46 | total loss: 0.00002 | time: 1.434s
| Adam | epoch: 001 | loss: 0.00002 - acc: 0.0885 -- iter: 04416/50000
Training Step: 47 | total loss: 0.00002 | time: 1.448s
| Adam | epoch: 001 | loss: 0.00002 - acc: 0.0945 -- iter: 04512/50000
Training Step: 48 | total loss: 0.00001 | time: 1.462s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0927 -- iter: 04608/50000
Training Step: 49 | total loss: 0.00001 | time: 1.476s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0896 -- iter: 04704/50000
Training Step: 50 | total loss: 0.00001 | time: 1.491s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0919 -- iter: 04800/50000
Training Step: 51 | total loss: 0.00001 | time: 1.504s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0890 -- iter: 04896/50000
Training Step: 52 | total loss: 0.00001 | time: 1.518s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0944 -- iter: 04992/50000
Training Step: 53 | total loss: 0.00001 | time: 1.539s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0989 -- iter: 05088/50000
Training Step: 54 | total loss: 0.00001 | time: 1.553s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0951 -- iter: 05184/50000
Training Step: 55 | total loss: 0.00000 | time: 1.567s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0964 -- iter: 05280/50000
Training Step: 56 | total loss: 0.00000 | time: 1.580s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0931 -- iter: 05376/50000
Training Step: 57 | total loss: 0.00000 | time: 1.594s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0903 -- iter: 05472/50000
Training Step: 58 | total loss: 0.00000 | time: 1.613s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0851 -- iter: 05568/50000
Training Step: 59 | total loss: 0.00000 | time: 1.641s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0835 -- iter: 05664/50000
Training Step: 60 | total loss: 0.00000 | time: 1.674s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0834 -- iter: 05760/50000
Since I am just a beginner, I don't understand what causes the network to report near-zero loss together with low accuracy at the same time (the accuracy sits around 10%, i.e., chance level for CIFAR-10's ten classes). Is it NaN outputs? Dead weights? Can anybody tell me how to fix this? Thanks!
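In case it matters, here is how I tried to probe for NaNs in the outputs and gradients (my own diagnostic sketch; I included -0.6 in the test values because that is where the denominator x + 0.6 of the positive branch hits zero, but I don't know how to interpret the result):

import numpy as np
import tensorflow as tf
from tflearn.activations import my_activation

x = tf.placeholder(tf.float32, shape=[None])
y = my_activation(x)
g = tf.gradients(y, x)[0]

with tf.Session() as sess:
    xs = np.array([-2.0, -0.6, -0.1, 0.0, 0.5, 2.0], dtype=np.float32)
    ys, gs = sess.run([y, g], feed_dict={x: xs})
    print("NaN in outputs:  ", np.isnan(ys).any())
    print("NaN in gradients:", np.isnan(gs).any())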
Please note that I'm not asking how to build a custom activation function. That is already covered by questions such as:
- How to make a custom activation function with only Python in Tensorflow?
- How to make a piecewise activation function with Python in TensorFlow?
- How do you create a custom activation function with Keras?
- How do I use custom activation functions in TensorFlow for Neural Networks?
- Is it possible to add new activation functions to TensorFlow / Theano / Torch? How?