I am trying to implement the Levenberg-Marquardt algorithm as a Keras optimizer, as described here, but I have several problems. The biggest one is this error:
TypeError: Tensor objects are not iterable when eager execution is not enabled. To iterate over this tensor use tf.map_fn.
After a quick search I found out that this is connected to how TensorFlow runs programs as graphs, which I don't understand in detail. I found this answer on SO useful, but it is about a loss function, not an optimizer.
So to the point.
My attempt looks like this:
from keras.optimizers import Optimizer
from keras.legacy import interfaces
from keras import backend as K


class Leveberg_Marquardt(Optimizer):
    def __init__(self, tau=1e-2, lambda_1=1e-5, lambda_2=1e+2, **kwargs):
        super(Leveberg_Marquardt, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.iterations = K.variable(0, dtype='int64', name='iterations')
            self.tau = K.variable(tau, name='tau')
            self.lambda_1 = K.variable(lambda_1, name='lambda_1')
            self.lambda_2 = K.variable(lambda_2, name='lambda_2')

    @interfaces.legacy_get_updates_support
    def get_updates(self, loss, params):
        grads = self.get_gradients(loss, params)
        self.updates = [K.update_add(self.iterations, 1)]
        error = [K.int_shape(m) for m in loss]
        for p, g, err in zip(params, grads, error):
            H = K.dot(g, K.transpose(g)) + self.tau * K.eye(K.max(g))
            # ended at step 3 from http://mads.lanl.gov/presentations/Leif_LM_presentation_m.pdf
            w = p - K.pow(H, -1) * K.dot(K.transpose(g), err)
            if self.tau > self.lambda_2:
                w = w - 1 / self.tau * err
            if self.tau < self.lambda_1:
                w = w - K.pow(H, -1) * err
            # Apply constraints.
            if getattr(p, 'constraint', None) is not None:
                w = p.constraint(w)
            self.updates.append(K.update_add(err, w))
        return self.updates

    def get_config(self):
        config = {'tau': float(K.get_value(self.tau)),
                  'lambda_1': float(K.get_value(self.lambda_1)),
                  'lambda_2': float(K.get_value(self.lambda_2))}
        base_config = super(Leveberg_Marquardt, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
Q1 Can I fix this error without going deep into TensorFlow? (I would like to stay at the Keras level.)
Q2 Am I using the Keras backend correctly?
I mean, in this line
H = K.dot(g, K.transpose(g)) + self.tau * K.eye(K.max(g))
should I use Keras backend functions, NumPy, or pure Python so that the code runs without problems, given that the input data are NumPy arrays?
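To make my doubt about that line concrete: as far as I can tell, K.pow(H, -1) raises every entry of H to the power -1, which is not the matrix inverse that the LM update needs, and K.eye expects an integer size rather than K.max(g). Below is a minimal plain-Python sketch of the difference for a 2x2 matrix. The matrix values and the helper name matmul2 are made up for illustration and are not part of Keras.

```python
def matmul2(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# A small symmetric matrix standing in for J^T J + tau * I (values are arbitrary).
H = [[2.0, 1.0],
     [1.0, 3.0]]

# Element-wise reciprocal: what an element-wise power of -1 computes.
elementwise = [[1.0 / v for v in row] for row in H]

# True 2x2 matrix inverse via the adjugate formula.
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
inverse = [[ H[1][1] / det, -H[0][1] / det],
           [-H[1][0] / det,  H[0][0] / det]]

prod_inverse = matmul2(H, inverse)          # close to the identity matrix
prod_elementwise = matmul2(H, elementwise)  # not the identity matrix

print(prod_inverse)
print(prod_elementwise)
```

Only the product with the true inverse gives the identity, so I suspect my update needs a linear solve or a matrix inverse instead of an element-wise power.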
Q3 This question is more about the algorithm itself.
Am I even implementing LMA correctly? I must say, I am not sure how to deal with the boundary conditions. I guessed the tau/lambda values; maybe you know a better way?
I tried to understand how the other optimizers in Keras work, but even the SGD code is hard for me to follow.
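For reference, this is how I understand an LM iteration outside Keras, as a minimal plain-Python sketch. The exponential model, the two-parameter setup, the factor of 10 used to adjust the damping, the lower clamp on the damping, and all function names are my own assumptions for illustration; they are not taken from the linked presentation or from Keras.

```python
import math

def residuals(params, xs, ys):
    """Residual vector r_i = a * exp(b * x_i) - y_i for the assumed model."""
    a, b = params
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]

def jacobian(params, xs):
    """Rows [dr_i/da, dr_i/db] of the Jacobian of the residuals."""
    a, b = params
    return [[math.exp(b * x), a * x * math.exp(b * x)] for x in xs]

def sum_sq(r):
    return sum(v * v for v in r)

def lm_candidate(params, xs, ys, lam):
    """Solve (J^T J + lam * I) delta = -J^T r for two parameters."""
    r = residuals(params, xs, ys)
    J = jacobian(params, xs)
    h00 = sum(row[0] * row[0] for row in J) + lam
    h01 = sum(row[0] * row[1] for row in J)
    h11 = sum(row[1] * row[1] for row in J) + lam
    g0 = sum(row[0] * v for row, v in zip(J, r))
    g1 = sum(row[1] * v for row, v in zip(J, r))
    det = h00 * h11 - h01 * h01
    # Closed-form 2x2 solve; a real implementation would use a linear solver.
    d0 = -(h11 * g0 - h01 * g1) / det
    d1 = -(-h01 * g0 + h00 * g1) / det
    return [params[0] + d0, params[1] + d1]

def levenberg_marquardt(params, xs, ys, lam=1e-2, n_iter=50):
    loss = sum_sq(residuals(params, xs, ys))
    for _ in range(n_iter):
        candidate = lm_candidate(params, xs, ys, lam)
        new_loss = sum_sq(residuals(candidate, xs, ys))
        if new_loss < loss:
            # Step accepted: keep it and move towards Gauss-Newton.
            params, loss = candidate, new_loss
            lam = max(lam / 10.0, 1e-12)
        else:
            # Step rejected: keep old parameters and move towards gradient descent.
            lam *= 10.0
    return params, loss

# Synthetic, noise-free data generated from a = 2, b = -1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(-x) for x in xs]
fitted, final_loss = levenberg_marquardt([1.0, 0.0], xs, ys)
print(fitted, final_loss)
```

The part I cannot map onto get_updates is the accept/reject decision: it needs the loss at the candidate parameters before the update is committed, whereas my attempt only compares tau against fixed thresholds.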
Q4 Do I need to change the local optimizers.py file in any way?
To run it, I initialize my optimizer with:
myOpt = Leveberg_Marquardt()
and then simply pass it to the compile method. Yet after a quick look at the source code of optimizers.py, I found that there are places in the code where the optimizer names are written out explicitly (e.g. the deserialize function). Is it important to extend these for my custom optimizer, or can I leave them as they are?
I would really appreciate any help and direction of future actions.