
Is there a way to update only a subset of parameters in DyNet? For instance, in the following toy example, first update h1, then h2:

 model = ParameterCollection()
 h1 = model.add_parameters((hidden_units, dims))
 h2 = model.add_parameters((hidden_units, dims))
 ...
 for x in trainset:
     ...
     loss.scalar_value()
     loss.backward()
     trainer.update(h1)
     renew_cg()

 for x in trainset:
     ...
     loss.scalar_value()
     loss.backward()
     trainer.update(h2)
     renew_cg()

I know that the update_subset interface exists for this and works with parameter indices. But it is not documented anywhere how to obtain those parameter indices from DyNet's Python API.

user3639557

1 Answer


A solution is to pass the flag update=False when creating expressions from parameters (including lookup parameters):

import dynet as dy
import numpy as np

model = dy.Model()
pW = model.add_parameters((2, 4))
pb = model.add_parameters(2)
trainer = dy.SimpleSGDTrainer(model)

def step(update_b):
    dy.renew_cg()
    x = dy.inputTensor(np.ones(4))
    W = pW.expr()
    # update b?
    b = pb.expr(update=update_b)

    loss = dy.pickneglogsoftmax(W * x + b, 0)
    loss.value()  # run the forward pass before calling backward
    loss.backward()
    trainer.update()
    # dy.renew_cg()

print(pb.as_array())
print(pW.as_array())
step(True)
print(pb.as_array()) # b updated
print(pW.as_array())
step(False)
print(pb.as_array()) # b not updated
print(pW.as_array())
  • For update_subset, I would guess that the indices are the integers suffixed to the parameter names (.name()). According to the docs, we are supposed to use a get_index function.
  • Another option is dy.nobackprop(), which prevents the gradient from propagating beyond a certain node in the graph.
  • And yet another option is to zero the gradients of the parameters that should not be updated (.scale_gradient(0)).
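The gradient-zeroing idea from the last bullet can be sketched without DyNet at all. This plain-NumPy fragment (a hypothetical toy model with manually derived gradients, plain SGD) freezes b by scaling its gradient to zero before the update:

```python
import numpy as np

# toy model: loss = 0.5 * ||W @ x + b||^2
W = np.ones((2, 4))
b = np.ones(2)
x = np.ones(4)
lr = 0.1

# forward pass and manual gradients
y = W @ x + b
grad_W = np.outer(y, x)   # dloss/dW
grad_b = y                # dloss/db

grad_b = grad_b * 0.0     # analogue of .scale_gradient(0): freeze b

W -= lr * grad_W
b -= lr * grad_b

print(b)  # unchanged: [1. 1.]
```

With plain SGD this is exactly equivalent to skipping the update for b; the paragraph below explains why that equivalence breaks for stateful optimizers.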

These methods are all equivalent to zeroing the gradient before the update. So the parameter may still move if the optimizer carries momentum from previous training steps (MomentumSGDTrainer, AdamTrainer, ...).
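This caveat is easy to demonstrate with plain NumPy (a hypothetical one-parameter setup with classic momentum SGD): after one real step, a second step with a zeroed gradient still moves the parameter, because the velocity buffer is nonzero:

```python
import numpy as np

p = np.array([1.0])   # parameter
v = np.array([0.0])   # momentum buffer
lr, mu = 0.1, 0.9

def momentum_step(grad):
    global v
    v = mu * v + grad     # classic momentum accumulation
    return lr * v

# step 1: real gradient -> parameter moves
p -= momentum_step(np.array([1.0]))
after_step1 = p.copy()

# step 2: gradient zeroed, but the velocity is still nonzero
p -= momentum_step(np.array([0.0]))

print(p != after_step1)  # True: the "frozen" parameter still moved
```

To truly freeze a parameter under a stateful optimizer, the update itself has to be skipped (as update=False does), not just the gradient zeroed.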

mcoav
  • Hmm ... something weird is going on: If you change the order of the steps, to first be step(False), then step(True), it will give an error ... – user3639557 Jun 06 '18 at 16:17
  • 1
  • It seems that using `.expr(update=False)` returns `None` before the first call to `dy.renew_cg()`. I moved the call to `dy.renew_cg()` to the beginning of the loop to solve this. I am not sure if it is a bug or not... I thought the first CG was initialized automatically. – mcoav Jun 07 '18 at 07:28
  • I realized it does, so removed the comment :) - in the initial_state you can set update=True/False. See http://dynet.readthedocs.io/en/latest/python_ref.html#dynet._RNNBuilder.initial_state – user3639557 Jun 07 '18 at 13:37