
Usually we feed a model with external data for training, but I would like to use a tensor coming from an intermediate layer of the same model as the input for the next batch. I believe this can be achieved with a manual training loop, but this time I would prefer to use fit_generator() from Keras (v2.2.4). I create the model using the Functional API.

Any help is appreciated. Thanks.

donto

2 Answers


A very simple approach is to build the loop inside the model itself:

inputs = Input(...)

#part 1 layers:
layer1 = SomeLayer(...)
layer2 = SomeLayer(...)
layer3 = SomeLayer(...)
intermediateLayer = IntermediateLayer(...)

#first pass:
out = layer1(inputs)
out = layer2(out)
out = layer3(out)
intermediate_out = intermediateLayer(out)

#second pass - the same layer objects are called again, so their weights are shared:
out = layer1(intermediate_out)
out = layer2(out)
out = layer3(out)
second_pass_out = intermediateLayer(out)

#rest of the model - you decide whether you need the first pass or only the second
out = SomeLayer(...)(second_pass_out)
out = SomeLayer(...)(out)
...
final_out = FinalLayer(...)(out)

The model is then built as:

model = Model(inputs, final_out)
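To make this concrete, here is a minimal runnable sketch of the same idea, using Dense layers as stand-ins for SomeLayer (all sizes and names here are illustrative, not from the question). The key constraint is that the intermediate layer's output shape must match the model's input shape, so the first-pass output can be fed through the same layers again:

    from keras.layers import Input, Dense
    from keras.models import Model

    inputs = Input(shape=(32,))

    # Create each layer object once so both passes share the same weights
    layer1 = Dense(64, activation='relu')
    layer2 = Dense(64, activation='relu')
    intermediate = Dense(32, activation='relu')  # output shape matches the input shape

    # First pass
    first_pass_out = intermediate(layer2(layer1(inputs)))

    # Second pass through the same layer objects (shared weights)
    second_pass_out = intermediate(layer2(layer1(first_pass_out)))

    # Rest of the model
    final_out = Dense(10, activation='softmax')(Dense(64, activation='relu')(second_pass_out))

    model = Model(inputs, final_out)
    model.compile(optimizer='adam', loss='categorical_crossentropy')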

You can, depending on your purposes, make only the second pass participate in training by blocking the gradients coming from the first pass:

#right after intermediate_out, before using it in the second pass
#(requires: from keras import backend as K)
intermediate_out = Lambda(lambda x: K.stop_gradient(x))(intermediate_out)

You can also create more models that share these layers and use each model for a different purpose; they will always stay in sync, since they use the same layer objects (see the sketch below).
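For example, a second Model built on the same graph can expose the intermediate output for inspection or prediction; since it holds the very same layer objects, it reflects every update made while training the main model (the names below come from the illustrative sketch above):

    import numpy as np

    # Shares all layers (and therefore all weights) with `model`
    feature_extractor = Model(inputs, first_pass_out)

    # Dummy batch just to illustrate; shape matches Input(shape=(32,))
    x_batch = np.random.rand(8, 32)
    features = feature_extractor.predict(x_batch)  # reflects the trained weights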

Notice that in "part 1" the layers are reused: each layer object is created once and called twice. In "rest of the model" the layers are not reused. If for some reason you need to reuse layers in that second part as well, do it the same way as in "part 1": create the layer object once, then call it on each input.

Daniel Möller
  • I'm actually trying to implement this source (https://github.com/dtransposed/Paper-Implementation/blob/eed841fcbfd94bcc19de727164877e12a75be4dc/action_recognition_using_visual_attention/utils.py#L180) using an easier approach. You can see that `hidden_state` is used again. I'm unsure whether your solution will solve my problem. I read in the Keras docs that there is train_on_batch() and I wonder if that function fits my expectation – donto Jan 21 '20 at 15:13

This is how I solved my problem:

model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
# Append the layer output to the metrics tensors so it is returned during training (what I want)
model.metrics_tensors += [model.get_layer('your_intermediate_layer').output]

Then train like this:

loss_out, ..., your_intermediate_layer_out = model.train_on_batch(X, y)  # "..." stands for the other compiled metric values

your_intermediate_layer_out is the NumPy array I was looking for during the model's training.
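Putting it together, a sketch of the full feedback loop might look like the following. This assumes Keras 2.2.4 behaviour, where tensors appended to model.metrics_tensors before the training function is built are returned by train_on_batch. It also assumes the intermediate layer's output has the same shape as the model's input; 'your_intermediate_layer', num_epochs, and batch_generator() are hypothetical stand-ins for your own names and data source:

    # Compile with just a loss so train_on_batch returns [loss, intermediate]
    model.compile(optimizer='adam', loss='mse')
    model.metrics_tensors += [model.get_layer('your_intermediate_layer').output]

    intermediate_out = None
    for epoch in range(num_epochs):
        for X, y in batch_generator():
            # From the second batch on, replace the external input with the
            # intermediate activations captured from the previous batch
            if intermediate_out is not None:
                X = intermediate_out
            loss, intermediate_out = model.train_on_batch(X, y)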

donto
  • Can you please have a look here https://stackoverflow.com/questions/67985962/lstm-auto-encoder-use-first-lstm-output-as-the-target-for-the-decoder ? Maybe it's related to what you solved, will appreciate your inputs. – Shlomi Schwartz Jun 15 '21 at 12:19