
Usually we feed a model with external data for training, but I would like to use a tensor coming from an intermediate layer of the same model as the input for the next batch. I believe this can be achieved with a manual training loop, but this time I would prefer to use fit_generator() from Keras (v2.2.4). I create the model using the Functional API.

Any help is appreciated. Thanks.

donto

2 Answers


A very simple approach is to build the loop inside the model itself:

inputs = Input(...)

#part 1 layers:
layer1 = SomeLayer(...)
layer2 = SomeLayer(...)
layer3 = SomeLayer(...)
intermediateLayer = IntermediateLayer(...)

#first pass:
out = layer1(inputs)
out = layer2(out)
out = layer3(out)
intermediate_out = intermediateLayer(out)

#second pass - the same layer objects are called again, so their weights are shared:
out = layer1(intermediate_out)
out = layer2(out)
out = layer3(out)
second_pass_out = intermediateLayer(out)

#rest of the model - you decide whether you need the first pass or only the second
out = SomeLayer(...)(second_pass_out)
out = SomeLayer(...)(out)
...
final_out = FinalLayer(...)(out)

The model is then built as:

model = Model(inputs, final_out)
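To make this concrete, here is a minimal runnable sketch of the same idea, using Dense layers as stand-ins for SomeLayer (all sizes and names here are illustrative, not from the question). The key constraint is that the intermediate layer's output shape must match the model's input shape, so the first-pass output can be fed through the same layers again:

    from keras.layers import Input, Dense
    from keras.models import Model

    inputs = Input(shape=(32,))

    # Create each layer object once so both passes share the same weights
    layer1 = Dense(64, activation='relu')
    layer2 = Dense(64, activation='relu')
    intermediate = Dense(32, activation='relu')  # output shape matches the input shape

    # First pass
    first_pass_out = intermediate(layer2(layer1(inputs)))

    # Second pass through the same layer objects (shared weights)
    second_pass_out = intermediate(layer2(layer1(first_pass_out)))

    # Rest of the model
    final_out = Dense(10, activation='softmax')(Dense(64, activation='relu')(second_pass_out))

    model = Model(inputs, final_out)
    model.compile(optimizer='adam', loss='categorical_crossentropy')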

You can, depending on your purposes, make only the second pass participate in training by blocking the gradients coming from the first pass:

#right after intermediate_out, before using it in the second pass
#(requires: from keras import backend as K)
intermediate_out = Lambda(lambda x: K.stop_gradient(x))(intermediate_out)

You can also create more models that share these layers and use each model for a different purpose; they will always stay in sync, since they use the same layer objects (see the sketch below).
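For example, a second Model built on the same graph can expose the intermediate output for inspection or prediction; since it holds the very same layer objects, it reflects every update made while training the main model (the names below come from the illustrative sketch above):

    import numpy as np

    # Shares all layers (and therefore all weights) with `model`
    feature_extractor = Model(inputs, first_pass_out)

    # Dummy batch just to illustrate; shape matches Input(shape=(32,))
    x_batch = np.random.rand(8, 32)
    features = feature_extractor.predict(x_batch)  # reflects the trained weights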

Notice that in "part 1" the layers are reused: each layer object is created once and called twice. In "rest of the model" the layers are not reused. If for some reason you need to reuse layers in that second part as well, do it the same way as in "part 1": create the layer object once, then call it on each input.

Daniel Möller
  • I'm actually trying to implement this source (https://github.com/dtransposed/Paper-Implementation/blob/eed841fcbfd94bcc19de727164877e12a75be4dc/action_recognition_using_visual_attention/utils.py#L180) using an easier approach. You can see that `hidden_state` is used again. I'm unsure whether your solution will solve my problem. I read in the Keras docs that there is train_on_batch() and I wonder if that function fits my expectation – donto Jan 21 '20 at 15:13

This is how I solved my problem:

model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
# Append the layer output to the metrics tensors so it is returned during training (what I want)
model.metrics_tensors += [model.get_layer('your_intermediate_layer').output]

Then train like this:

loss_out, ..., your_intermediate_layer_out = model.train_on_batch(X, y)  # "..." stands for the other compiled metric values

your_intermediate_layer_out is the NumPy array I was looking for during the model's training.
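Putting it together, a sketch of the full feedback loop might look like the following. This assumes Keras 2.2.4 behaviour, where tensors appended to model.metrics_tensors before the training function is built are returned by train_on_batch. It also assumes the intermediate layer's output has the same shape as the model's input; 'your_intermediate_layer', num_epochs, and batch_generator() are hypothetical stand-ins for your own names and data source:

    # Compile with just a loss so train_on_batch returns [loss, intermediate]
    model.compile(optimizer='adam', loss='mse')
    model.metrics_tensors += [model.get_layer('your_intermediate_layer').output]

    intermediate_out = None
    for epoch in range(num_epochs):
        for X, y in batch_generator():
            # From the second batch on, replace the external input with the
            # intermediate activations captured from the previous batch
            if intermediate_out is not None:
                X = intermediate_out
            loss, intermediate_out = model.train_on_batch(X, y)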

donto
  • Can you please have a look here https://stackoverflow.com/questions/67985962/lstm-auto-encoder-use-first-lstm-output-as-the-target-for-the-decoder ? Maybe it's related to what you solved, will appreciate your inputs. – Shlomi Schwartz Jun 15 '21 at 12:19