This question is about writing a custom loss function with tf.numpy_function, where the gradient of the custom loss must also be provided manually.
Assume a custom loss function with extra input arguments is written:
def custom_loss(self, targetPosition):
    def loss(y_true, y_pred):
        loss_ = tf.numpy_function(self.forAllWrapped, [y_pred, targetPosition], tf.float32)
        return loss_
    return loss
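For context, this is roughly how such a closure gets wired in when compiling the model. Below is a minimal, self-contained sketch; the tiny model, the batch size, and the stand-in loss expression are placeholders for illustration only, not my actual setup:

import numpy as np
import tensorflow as tf
from tensorflow import keras

BATCH = 8

def custom_loss(targetPosition):                       # free function here, a method in my code
    def loss(y_true, y_pred):
        # stand-in for the tf.numpy_function call, using plain TF ops so the sketch runs on its own
        return tf.reduce_mean(tf.square(y_pred - targetPosition), axis=1)
    return loss

targetPosition = tf.constant(np.random.rand(BATCH, 2), dtype=tf.float32)

model = keras.Sequential([keras.layers.Dense(2, input_shape=(3,))])
model.compile(optimizer='adam', loss=custom_loss(targetPosition))
model.fit(np.random.rand(BATCH, 3), np.zeros((BATCH, 2)), batch_size=BATCH, epochs=1)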
The loss computation itself is wrapped in tf.numpy_function, which receives the tensor values as NumPy arrays and sends them through gRPC to some C# code that calculates the loss:
def forAllWrapped(self, y_pred, targetPosition):
    result = np.empty((0, targetPosition.shape[1]), np.float32)
    i = 0
    numFKs = y_pred.shape[0]
    while i < numFKs:
        res_ = GET-INDIVIDUAL-LOSS-FROM-GRPC-METHOD(y_pred[i,], targetPosition[i,])
        result = np.append(result, self.trimResultFK(res_), axis=0)
        i = i + 1
    loss__ = K.mean(K.square(targetPosition - result), axis=1)
    return loss__.astype(np.float32)
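As a small, standalone illustration of the mechanism (not my actual code): tf.numpy_function hands the wrapped Python function plain NumPy arrays, which is why ordinary NumPy code and the gRPC client can run inside it:

import numpy as np
import tensorflow as tf

def wrapped(a, b):
    # inside the wrapper the arguments are numpy.ndarray, not tensors
    assert isinstance(a, np.ndarray) and isinstance(b, np.ndarray)
    return np.mean(np.square(a - b), axis=1).astype(np.float32)

out = tf.numpy_function(wrapped, [tf.ones([4, 2]), tf.zeros([4, 2])], tf.float32)
print(out)  # per-sample values, shape (4,)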
The loss calculation part works perfectly. But when I try to use my custom loss function for training, I get this error:
Variable xxxxxx has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
Because non-TensorFlow functions are used to calculate the loss value, the gradient is unknown to the optimizer and cannot be derived via automatic differentiation. To solve this, I tried to implement a custom gradient and attach it to my custom loss function, based on this nice answer. Here are the corresponding parts, which attach the gradient using my own version of the wrapper, self.numpy_function:
def numpy_function(self, func, inp, dtype=tf.float32, grad=None):
    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
    tf.RegisterGradient(rnd_name)(grad)
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.numpy_function(func, inp, dtype)
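The custom loss then calls this wrapper instead of tf.numpy_function directly, so that the gradient below gets registered. The exact call site is not shown above, so this is my reconstruction of the wiring:

def custom_loss(self, targetPosition):
    def loss(y_true, y_pred):
        # same as before, but routed through the gradient-overriding wrapper
        loss_ = self.numpy_function(self.forAllWrapped,
                                    [y_pred, targetPosition],
                                    tf.float32,
                                    grad=self.customGrad)
        return loss_
    return loss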
I now have three methods that together override the gradient. The first calculates the gradient in the NumPy domain (dummy outputs of ones for y_pred and zeros for targetPosition, with the same shapes as their NumPy arrays):
def np_d_customGrad(self, y_pred, targetPosition):
    return np.ones((self.batch_size, 2)).astype(np.float32), np.zeros((self.batch_size, 3)).astype(np.float32)
Then it is wrapped for TensorFlow:
def tf_customGrad(self, y_pred, targetPosition):
    d_y_pred_, d_targetPosition_ = tf.numpy_function(self.np_d_customGrad, [y_pred, targetPosition], [tf.float32, tf.float32])
    return d_y_pred_, d_targetPosition_
and finally used to calculate the gradients for the two inputs of the custom loss function (y_pred and targetPosition):
def customGrad(self, op, grad):
    y_pred = op.inputs[0]
    targetPosition = op.inputs[1]
    d_y_pred, d_targetPosition = self.tf_customGrad(y_pred, targetPosition)
    return grad * d_y_pred, grad * d_targetPosition
This results in a dimension error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1000,2] vs. [1000]
I have tried returning different shapes of zeros and ones from np_d_customGrad, but it does not work. I assume that my function must return two gradients, one per input argument of the custom loss function (please correct me if I am wrong).
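As far as I can tell, the mismatch is because the incoming grad has one value per sample (the shape of loss_), while my dummy d_y_pred has one value per output coordinate; the multiplication alone reproduces the error (shapes taken from the message above):

import tensorflow as tf

grad = tf.ones([1000])          # incoming gradient: one value per sample, like loss_
d_y_pred = tf.ones([1000, 2])   # dummy gradient w.r.t. y_pred
try:
    grad * d_y_pred             # shapes (1000,) and (1000, 2) do not broadcast
except tf.errors.InvalidArgumentError as e:
    print(e)                    # Incompatible shapes: [1000,2] vs. [1000]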
I have also tried returning the NumPy array result from forAllWrapped instead of the loss__ values and using
return K.mean(K.square(targetPosition - actualPosition__), axis=1)
in custom_loss, hoping that TensorFlow would derive the gradient itself, but it could not.
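That attempt looked roughly like this (my reconstruction; here forAllWrapped returns the stacked gRPC results, actualPosition__, instead of the finished loss):

def custom_loss(self, targetPosition):
    def loss(y_true, y_pred):
        # forAllWrapped modified to return the stacked positions instead of loss__
        actualPosition__ = tf.numpy_function(self.forAllWrapped,
                                             [y_pred, targetPosition],
                                             tf.float32)
        return K.mean(K.square(targetPosition - actualPosition__), axis=1)
    return loss

Presumably the K.mean/K.square part is differentiable, but the output of tf.numpy_function still carries no gradient back to y_pred, so the same "None for gradient" error appears.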
Please note that I cannot re-implement the GET-INDIVIDUAL-LOSS-FROM-GRPC-METHOD method in Python using TF, as that would take months of work due to its complexity.