@patapouf_ai
This relates to the question How to make a custom activation function with only Python in Tensorflow?
I am a newcomer to Python, Keras, and TensorFlow. I implemented a piecewise-constant custom activation function using the method above, as follows:
import tensorflow as tf
from tensorflow.python.framework import ops
from keras.backend.tensorflow_backend import get_session
import numpy as np
def QPWC_Func(z, sharp):
    s = np.zeros(z.shape)
    ds = np.zeros(z.shape)
    for m in np.arange(0, len(z)):
        if z[m] <= 0:
            s[m] = 0
            ds[m] = 0
        elif (z[m] > 0) and (z[m] <= 0.25):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.125)/0.25)))
            ds[m] = sharp/0.25 * s[m] * (1-s[m]/0.25)
        elif (z[m] > 0.25) and (z[m] <= 0.5):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.375)/0.25))) + 0.25
            ds[m] = sharp/0.25 * (s[m]-0.25) * (1-(s[m]-0.25)/0.25)
        elif (z[m] > 0.5) and (z[m] <= 0.75):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.625)/0.25))) + 0.5
            ds[m] = sharp/0.25 * (s[m]-0.5) * (1-(s[m]-0.5)/0.25)
        elif (z[m] > 0.75) and (z[m] <= 1):
            # If z is larger than 0.75, the gradient descends to it faster than in the other cases
            s[m] = 0.5 / (1+np.exp(-sharp*((z[m]-1)/0.5))) + 0.75
            ds[m] = sharp/0.5 * (s[m]-0.75) * (1-(s[m]-0.75)/0.5)
        else:
            s[m] = 1
            ds[m] = 0
    return s
def Derv_QPWC_Func(z, sharp):
    s = np.zeros(z.shape)
    ds = np.zeros(z.shape)
    for m in np.arange(0, len(z)):
        if z[m] <= 0:
            s[m] = 0
            ds[m] = 0
        elif (z[m] > 0) and (z[m] <= 0.25):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.125)/0.25)))
            ds[m] = sharp/0.25 * s[m] * (1-s[m]/0.25)
        elif (z[m] > 0.25) and (z[m] <= 0.5):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.375)/0.25))) + 0.25
            ds[m] = sharp/0.25 * (s[m]-0.25) * (1-(s[m]-0.25)/0.25)
        elif (z[m] > 0.5) and (z[m] <= 0.75):
            s[m] = 0.25 / (1+np.exp(-sharp*((z[m]-0.625)/0.25))) + 0.5
            ds[m] = sharp/0.25 * (s[m]-0.5) * (1-(s[m]-0.5)/0.25)
        elif (z[m] > 0.75) and (z[m] <= 1):
            # If z is larger than 0.75, the gradient descends to it faster than in the other cases
            s[m] = 0.5 / (1+np.exp(-sharp*((z[m]-1)/0.5))) + 0.75
            ds[m] = sharp/0.5 * (s[m]-0.75) * (1-(s[m]-0.75)/0.5)
        else:
            s[m] = 1
            ds[m] = 0
    return ds
QPWC = np.vectorize(QPWC_Func)
Derv_QPWC = np.vectorize(Derv_QPWC_Func)
Derv_QPWC32 = lambda z, sharp: Derv_QPWC_Func(z, sharp).astype(np.float32)
QPWC_32 = lambda z, sharp: QPWC_Func(z, sharp).astype(np.float32)
# tf.py_func acts on lists of tensors (and returns a list of tensors), which is why we have [z, sharp] (and return y[0]).
def tf_QPWC_Fun32(z, sharp, name=None):
    with tf.name_scope(name, "QPWC_Func", [z, sharp]) as name:
        y = py_func(QPWC_32,
                    [z, sharp],
                    [tf.float32],
                    name=name,
                    grad=Derv_QPWC_Func32)  # <-- here's the call to the gradient
        return y[0]
# The stateful option tells TensorFlow whether the function always gives the same output for the same input (stateful=False).
def tf_Derv_QPWC_Func32(z, sharp, name=None):
    with tf.name_scope(name, "Derv_QPWC_Func", [z, sharp]) as name:
        y = tf.py_func(Derv_QPWC32,
                       [z, sharp],
                       [tf.float32],
                       name=name,
                       stateful=False)
        return y[0]
# A hack to define gradients of a function using tf.RegisterGradient and tf.Graph.gradient_override_map
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)
def Derv_QPWC_Func32(op, grad):
    z = op.inputs[0]
    sharp = op.inputs[1]
    n_gr = tf_Derv_QPWC_Func32(z, sharp)
    return grad * n_gr
with tf.Session() as sess:
    x = tf.constant([0.2, 0.7, 1, 0.75])
    y = tf_QPWC_Fun32(x, 100)
    tf.initialize_all_variables().run()
    print(x.eval(), y.eval(), tf.gradients(y, [x])[0].eval())
I have several questions:

1. As you can see, like Sigmoid, my function can in fact compute the feedforward output and its gradient at the same time. So is there a way in tf to call the function once, and once only, so that both results are obtained?
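For instance, I imagine something like the sketch below, where the py_func returns both arrays and the registered gradient reuses the op's second output instead of calling back into NumPy a second time. The names QPWC_Both32, Grad_QPWC_Both32, and tf_QPWC_Both32 are my own, and I have not verified that this works:

def QPWC_Both32(z, sharp):
    # QPWC_Func already computes ds internally and discards it; a merged loop
    # would return both in one pass. The two calls here are just a stand-in.
    s = QPWC_Func(z, sharp)
    ds = Derv_QPWC_Func(z, sharp)
    return s.astype(np.float32), ds.astype(np.float32)

def Grad_QPWC_Both32(op, grad_s, grad_ds):
    # one incoming gradient per output; the derivative is already output 1,
    # so no second py_func round-trip is needed
    return grad_s * op.outputs[1], None  # None for the sharp input (my guess)

def tf_QPWC_Both32(z, sharp, name=None):
    with tf.name_scope(name, "QPWC_Both", [z, sharp]) as name:
        # two entries in Tout, one per array returned by QPWC_Both32
        s, ds = py_func(QPWC_Both32, [z, sharp], [tf.float32, tf.float32],
                        name=name, grad=Grad_QPWC_Both32)
        return s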
2. I have two inputs to the custom function, and when I ran it, Python popped up the following error:
File "D:\TProgramFiles\Anaconda3\envs\keras-gpu\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 664, in gradients
unconnected_gradients)
File "D:\TProgramFiles\Anaconda3\envs\keras-gpu\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 972, in _GradientsHelper
_VerifyGeneratedGradients(in_grads, op)
File "D:\TProgramFiles\Anaconda3\envs\keras-gpu\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 335, in _VerifyGeneratedGradients
"inputs %d" % (len(grads), op.node_def, len(op.inputs)))
ValueError: Num gradients 1 generated for op name: "QPWC_Func_10"
op: "PyFunc"
input: "Const_11"
input: "QPWC_Func_10/input_1"
attr {
key: "Tin"
value {
list {
type: DT_FLOAT
type: DT_INT32
}
}
}
attr {
key: "Tout"
value {
list {
type: DT_FLOAT
}
}
}
attr {
key: "_gradient_op_type"
value {
s: "PyFuncGrad64499082"
}
}
attr {
key: "token"
value {
s: "pyfunc_11"
}
}
do not match num inputs 2
What does this mean? Where did I make a mistake?
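My only guess so far is that the registered gradient must return one gradient per input of the op, so it would need a second return value for sharp, perhaps None since sharp is a constant. Something like this, though I am not sure it is correct:

def Derv_QPWC_Func32(op, grad):
    z = op.inputs[0]
    sharp = op.inputs[1]
    n_gr = tf_Derv_QPWC_Func32(z, sharp)
    # guess: one gradient per op input; sharp is not trainable, hence None
    return grad * n_gr, None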
3. Once everything is working, in which file should I put the code so that I can use it? Or should I write a stand-alone file and then import QPWC_Func myself? I am using Keras, so which modules exactly should I import? May I have an example, please?
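For example, is something like this what is intended? Here qpwc.py is a file name of my own choosing holding all the code above, and I am only guessing that a Lambda layer is the right wrapper:

from keras.models import Sequential
from keras.layers import Dense, Lambda
from qpwc import tf_QPWC_Fun32  # qpwc.py is my own stand-alone file

model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# wrap the custom op in a Lambda layer, fixing sharp=100
# (I suspect my NumPy loop would also need to handle batched 2-D inputs)
model.add(Lambda(lambda t: tf_QPWC_Fun32(t, 100)))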
4. The tutorial uses float32. If I intend to use float16 instead, should I put
K.set_floatx('float16')
K.set_epsilon(1e-4)
here? And then use
QPWC_32 = lambda z, sharp: QPWC_Func(z, sharp).astype(np.float16)
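Put together, is this roughly what the float16 version should look like? QPWC_16 and tf_QPWC_Fun16 are my own renamings, and I do not know whether py_func handles float16 cleanly:

from keras import backend as K

K.set_floatx('float16')
K.set_epsilon(1e-4)

QPWC_16 = lambda z, sharp: QPWC_Func(z, sharp).astype(np.float16)

def tf_QPWC_Fun16(z, sharp, name=None):
    with tf.name_scope(name, "QPWC_Func", [z, sharp]) as name:
        # Tout changed to tf.float16 to match; the gradient wrapper would
        # presumably need the same dtype change
        y = py_func(QPWC_16, [z, sharp], [tf.float16],
                    name=name, grad=Derv_QPWC_Func32)
        return y[0]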
5. In another thread, Replacing sigmoid activation with custom activation, @Alexandre Passos gave another way to implement a custom activation:
def custom_activation_4(x):
    orig = x
    # plain Python `and` does not work on tensors, so tf.logical_and is needed here
    x = tf.where(orig < -6, tf.zeros_like(x), x)
    x = tf.where(tf.logical_and(orig >= -6, orig < -4), (0.0078*x + 0.049), x)
    x = tf.where(tf.logical_and(orig >= -4, orig < 0), (0.1205*x + 0.5), x)
    x = tf.where(tf.logical_and(orig >= 0, orig < 4), (0.1205*x + 0.5), x)
    x = tf.where(tf.logical_and(orig >= 4, orig < 6), (0.0078*x + 0.951), x)
    return tf.where(orig >= 6, tf.ones_like(x), x)
I reckon I may be able to implement mine via that approach as well. Yet it does not show how to compute the gradients. Will tf do that automatically for this form of implementation? If so, in which file should I put the code so that I can use it, and which modules exactly should I import? May I have an example, please?
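My understanding is that tf.where is composed of ops TensorFlow already knows how to differentiate, so a quick check like the one below should show whether tf.gradients works without any RegisterGradient hack; this is only my guess:

with tf.Session() as sess:
    x = tf.constant([-7.0, -5.0, 2.0, 7.0])
    y = custom_activation_4(x)
    # no registered gradient here; autodiff should follow tf.where itself
    print(sess.run(tf.gradients(y, [x])[0]))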
Many thanks indeed!