@patapouf_ai
This relates to the question How to make a custom activation function with only Python in Tensorflow?
I am a newcomer to Python, Keras, and TensorFlow. I implemented a piecewise-constant custom activation function using the method above, as follows:
import tensorflow as tf
from tensorflow.python.framework import ops
from keras.backend.tensorflow_backend import get_session
import numpy as np
def QPWC_Func(z, sharp):
    s = np.zeros(z.shape)
    ds = np.zeros(z.shape)
    for m in np.arange(0, len(z)):
        if z[m] <= 0:
            s[m] = 0
            ds[m] = 0
        elif (z[m] > 0) and (z[m] <= 0.25):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.125)/0.25)))
            ds[m] = sharp/0.25 * s[m] * (1-s[m]/0.25)
        elif (z[m] > 0.25) and (z[m] <= 0.5):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.375)/0.25))) + 0.25
            ds[m] = sharp/0.25 * (s[m]-0.25) * (1-(s[m]-0.25)/0.25)
        elif (z[m] > 0.5) and (z[m] <= 0.75):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.625)/0.25))) + 0.5
            ds[m] = sharp/0.25 * (s[m]-0.5) * (1-(s[m]-0.5)/0.25)
        elif (z[m] > 0.75) and (z[m] <= 1):
            # If z is larger than 0.75, the gradient descends to it faster than in the other cases
            s[m] = 0.5 / (1+np.exp(-sharp*((z[m]-1)/0.5))) + 0.75
            ds[m] = sharp/0.5 * (s[m]-0.75) * (1-(s[m]-0.75)/0.5)
        else:
            s[m] = 1
            ds[m] = 0
    return s
def Derv_QPWC_Func(z, sharp):
    s = np.zeros(z.shape)
    ds = np.zeros(z.shape)
    for m in np.arange(0, len(z)):
        if z[m] <= 0:
            s[m] = 0
            ds[m] = 0
        elif (z[m] > 0) and (z[m] <= 0.25):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.125)/0.25)))
            ds[m] = sharp/0.25 * s[m] * (1-s[m]/0.25)
        elif (z[m] > 0.25) and (z[m] <= 0.5):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.375)/0.25))) + 0.25
            ds[m] = sharp/0.25 * (s[m]-0.25) * (1-(s[m]-0.25)/0.25)
        elif (z[m] > 0.5) and (z[m] <= 0.75):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.625)/0.25))) + 0.5
            ds[m] = sharp/0.25 * (s[m]-0.5) * (1-(s[m]-0.5)/0.25)
        elif (z[m] > 0.75) and (z[m] <= 1):
            # If z is larger than 0.75, the gradient descends to it faster than in the other cases
            s[m] = 0.5 / (1+np.exp(-sharp*((z[m]-1)/0.5))) + 0.75
            ds[m] = sharp/0.5 * (s[m]-0.75) * (1-(s[m]-0.75)/0.5)
        else:
            s[m] = 1
            ds[m] = 0
    return ds
QPWC = np.vectorize(QPWC_Func)
Derv_QPWC = np.vectorize(Derv_QPWC_Func)
Derv_QPWC32 = lambda z, sharp: Derv_QPWC_Func(z, sharp).astype(np.float32)
QPWC_32 = lambda z, sharp: QPWC_Func(z, sharp).astype(np.float32)
# tf.py_func acts on lists of tensors (and returns a list of tensors), which is why we have [z, sharp] (and return y[0]).
def tf_QPWC_Fun32(z, sharp, name=None):
    with tf.name_scope(name, "QPWC_Func", [z, sharp]) as name:
        y = py_func(QPWC_32,
                    [z, sharp],
                    [tf.float32],
                    name=name,
                    grad=Derv_QPWC_Func32)  # <-- here's the call to the gradient
        return y[0]
# The stateful option tells TensorFlow whether the function always gives the same output for the same input (stateful=False).
def tf_Derv_QPWC_Func32(z, sharp, name=None):
    with tf.name_scope(name, "Derv_QPWC_Func", [z, sharp]) as name:
        y = tf.py_func(Derv_QPWC32,
                       [z, sharp],
                       [tf.float32],
                       name=name,
                       stateful=False)
        return y[0]
# A hack to define gradients of a function using tf.RegisterGradient and tf.Graph.gradient_override_map
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)
def Derv_QPWC_Func32(op, grad):
    z = op.inputs[0]
    sharp = op.inputs[1]
    n_gr = tf_Derv_QPWC_Func32(z, sharp)
    return grad * n_gr
with tf.Session() as sess:
    x = tf.constant([0.2, 0.7, 1, 0.75])
    y = tf_QPWC_Fun32(x, 100)
    tf.initialize_all_variables().run()
    print(x.eval(), y.eval(), tf.gradients(y, [x])[0].eval())
I have several questions:

1. As you can see, like Sigmoid, my function can in fact compute the feedforward output and its gradient at the same time. So is there a way in tf to call the function once, and once only, so that both results are obtained?
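For instance, I imagine something like the sketch below, where the py_func returns both arrays and the registered gradient reuses the op's second output instead of calling back into NumPy a second time. The names QPWC_Both32, Grad_QPWC_Both32, and tf_QPWC_Both32 are my own, and I have not verified that this works:

def QPWC_Both32(z, sharp):
    # QPWC_Func already computes ds internally and discards it; a merged loop
    # would return both in one pass. The two calls here are just a stand-in.
    s = QPWC_Func(z, sharp)
    ds = Derv_QPWC_Func(z, sharp)
    return s.astype(np.float32), ds.astype(np.float32)

def Grad_QPWC_Both32(op, grad_s, grad_ds):
    # one incoming gradient per output; the derivative is already output 1,
    # so no second py_func round-trip is needed
    return grad_s * op.outputs[1], None  # None for the sharp input (my guess)

def tf_QPWC_Both32(z, sharp, name=None):
    with tf.name_scope(name, "QPWC_Both", [z, sharp]) as name:
        # two entries in Tout, one per array returned by QPWC_Both32
        s, ds = py_func(QPWC_Both32, [z, sharp], [tf.float32, tf.float32],
                        name=name, grad=Grad_QPWC_Both32)
        return s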
2. I have two inputs to the custom function, and when I ran it, Python popped up the following error:
File "D:\TProgramFiles\Anaconda3\envs\keras-gpu\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 664, in gradients
unconnected_gradients)
File "D:\TProgramFiles\Anaconda3\envs\keras-gpu\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 972, in _GradientsHelper
_VerifyGeneratedGradients(in_grads, op)
File "D:\TProgramFiles\Anaconda3\envs\keras-gpu\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 335, in _VerifyGeneratedGradients
"inputs %d" % (len(grads), op.node_def, len(op.inputs)))
ValueError: Num gradients 1 generated for op name: "QPWC_Func_10"
op: "PyFunc"
input: "Const_11"
input: "QPWC_Func_10/input_1"
attr {
key: "Tin"
value {
list {
type: DT_FLOAT
type: DT_INT32
}
}
}
attr {
key: "Tout"
value {
list {
type: DT_FLOAT
}
}
}
attr {
key: "_gradient_op_type"
value {
s: "PyFuncGrad64499082"
}
}
attr {
key: "token"
value {
s: "pyfunc_11"
}
}
do not match num inputs 2
What does this mean? Where did I make a mistake?
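My only guess so far is that the registered gradient must return one gradient per input of the op, so it would need a second return value for sharp, perhaps None since sharp is a constant. Something like this, though I am not sure it is correct:

def Derv_QPWC_Func32(op, grad):
    z = op.inputs[0]
    sharp = op.inputs[1]
    n_gr = tf_Derv_QPWC_Func32(z, sharp)
    # guess: one gradient per op input; sharp is not trainable, hence None
    return grad * n_gr, None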
3. Once everything is working, in which file should I put the code so that I can use it? Or should I write a stand-alone file and then import QPWC_Func myself? I am using Keras, so which modules exactly should I import? May I have an example, please?
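For example, is something like this what is intended? Here qpwc.py is a file name of my own choosing holding all the code above, and I am only guessing that a Lambda layer is the right wrapper:

from keras.models import Sequential
from keras.layers import Dense, Lambda
from qpwc import tf_QPWC_Fun32  # qpwc.py is my own stand-alone file

model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# wrap the custom op in a Lambda layer, fixing sharp=100
# (I suspect my NumPy loop would also need to handle batched 2-D inputs)
model.add(Lambda(lambda t: tf_QPWC_Fun32(t, 100)))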
4. The tutorial uses float32. If I intend to use float16 instead, should I put
K.set_floatx('float16')
K.set_epsilon(1e-4)
here? And then use
QPWC_32 = lambda z, sharp: QPWC_Func(z, sharp).astype(np.float16)
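Put together, is this roughly what the float16 version should look like? QPWC_16 and tf_QPWC_Fun16 are my own renamings, and I do not know whether py_func handles float16 cleanly:

from keras import backend as K

K.set_floatx('float16')
K.set_epsilon(1e-4)

QPWC_16 = lambda z, sharp: QPWC_Func(z, sharp).astype(np.float16)

def tf_QPWC_Fun16(z, sharp, name=None):
    with tf.name_scope(name, "QPWC_Func", [z, sharp]) as name:
        # Tout changed to tf.float16 to match; the gradient wrapper would
        # presumably need the same dtype change
        y = py_func(QPWC_16, [z, sharp], [tf.float16],
                    name=name, grad=Derv_QPWC_Func32)
        return y[0]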
5. In another thread, Replacing sigmoid activation with custom activation, @Alexandre Passos gave another way to implement a custom activation:
def custom_activation_4(x):
    orig = x
    # plain Python `and` does not work on tensors, so tf.logical_and is needed here
    x = tf.where(orig < -6, tf.zeros_like(x), x)
    x = tf.where(tf.logical_and(orig >= -6, orig < -4), (0.0078*x + 0.049), x)
    x = tf.where(tf.logical_and(orig >= -4, orig < 0), (0.1205*x + 0.5), x)
    x = tf.where(tf.logical_and(orig >= 0, orig < 4), (0.1205*x + 0.5), x)
    x = tf.where(tf.logical_and(orig >= 4, orig < 6), (0.0078*x + 0.951), x)
    return tf.where(orig >= 6, tf.ones_like(x), x)
I reckon I may be able to implement mine via that approach as well. Yet it does not show how to compute the gradients. Will tf do that automatically for this form of implementation? If so, in which file should I put the code so that I can use it, and which modules exactly should I import? May I have an example, please?
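My understanding is that tf.where is composed of ops TensorFlow already knows how to differentiate, so a quick check like the one below should show whether tf.gradients works without any RegisterGradient hack; this is only my guess:

with tf.Session() as sess:
    x = tf.constant([-7.0, -5.0, 2.0, 7.0])
    y = custom_activation_4(x)
    # no registered gradient here; autodiff should follow tf.where itself
    print(sess.run(tf.gradients(y, [x])[0]))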
Many thanks indeed!