25

Imagine a fully-connected neural network whose last two layers have the following structure:

[Dense]
    units = 612
    activation = softplus

[Dense]
    units = 1
    activation = sigmoid

The output value of the net is 1, but I'd like to know what the input x to the sigmoid function was (it must be some high number, since sigm(x) is 1 here).
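
For reference, a quick NumPy check of what the sigmoid does to large and to negative inputs (just the formula, independent of the network above):

import numpy as np

def sigm(z):
    return 1.0 / (1.0 + np.exp(-z))

print(sigm(20.0))   # ~0.999999998, effectively 1
print(sigm(-11.7))  # ~8.3e-06, nowhere near 1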

Following indraforyou's answer, I managed to retrieve the outputs and weights of the Keras layers:

outputs = [layer.output for layer in model.layers[-2:]]
functors = [K.function( [model.input]+[K.learning_phase()], [out] ) for out in outputs]

test_input = np.array(...)
layer_outs = [func([test_input, 0.]) for func in functors]

print(layer_outs[-1][0])  # -> array([[ 1.]])

dense_0_out = layer_outs[-2][0]                           # shape (612, 1)
dense_1_weights = model.layers[-1].weights[0].get_value() # shape (1, 612)
dense_1_bias = model.layers[-1].weights[1].get_value()

x = np.dot(dense_0_out, dense_1_weights) + dense_1_bias
print(x)  # -> -11.7

How can x be a negative number? In that case, the last layer's output should be a number closer to 0.0 than to 1.0. Are dense_0_out or dense_1_weights the wrong outputs or weights?

johk95
  • Shouldn't it be `x = np.dot(dense_0_out, dense_1_weights) + dense_1_bias`? – Marcin Możejko Aug 03 '17 at 19:00
  • @MarcinMożejko you're right, I corrected it. Didn't change anything since bias was trained to 0.0. – johk95 Aug 03 '17 at 19:02
  • But the output from this layer is fed to softmax - the value you obtained is then squashed to `[0, 1]` interval. – Marcin Możejko Aug 03 '17 at 19:03
  • @MarcinMożejko you mean the last layer? It is fed to sigmoid, yes. So if the value was -11.7, feed it to sigmoid and obtain some near-zero value. `layer_outs[-1]` says 1 instead... – johk95 Aug 03 '17 at 19:07
  • Ah - shouldn't it be `x = np.dot(dense_1_weights, dense_0_out.transpose())`? – Marcin Możejko Aug 03 '17 at 19:10
  • @MarcinMożejko nope, because `dense_1_weights.shape = (1, 612)` and `dense_0_out.shape = (612, 1)`. To be sure you could do `x = numpy.sum(dense_1_weights.flatten() * dense_0_out.flatten())` and that leads to the same results. – johk95 Aug 03 '17 at 19:20
  • Could you print out a `model.summary()`? – Marcin Możejko Aug 03 '17 at 19:41
  • @MarcinMożejko here: https://www.dropbox.com/s/qiztbj848yq4yqt/Screenshot%202017-08-04%2013.30.05.png?dl=0 As you see, there actually is a Dropout-layer in between that I didn't mention. Its output is the same as Dense_2's, so I thought it would just be too complicated mentioning it! – johk95 Aug 04 '17 at 11:31

6 Answers

11

Since you're using get_value(), I'll assume that you're using the Theano backend. To get the value of the node before the sigmoid activation, you can traverse the computation graph.

The graph can be traversed starting from an output (the result of some computation) down to its inputs using the owner field.

In your case, what you want is the input x of the sigmoid activation op. The output of the sigmoid op is model.output. Putting these together, the variable x is model.output.owner.inputs[0].

If you print out this value, you'll see Elemwise{add,no_inplace}.0, which is an element-wise addition op. It can be verified from the source code of Dense.call():

def call(self, inputs):
    output = K.dot(inputs, self.kernel)
    if self.use_bias:
        output = K.bias_add(output, self.bias)
    if self.activation is not None:
        output = self.activation(output)
    return output

The input to the activation function is the output of K.bias_add().

With a small modification of your code, you can get the value of the node before activation:

x = model.output.owner.inputs[0]
func = K.function([model.input] + [K.learning_phase()], [x])
print(func([test_input, 0.]))

For anyone using TensorFlow backend: use x = model.output.op.inputs[0] instead.
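
For example, the same retrieval with the TensorFlow attribute (assuming the same model and test_input as above):

x = model.output.op.inputs[0]  # the tensor feeding the sigmoid op, i.e. the bias-add output
func = K.function([model.input] + [K.learning_phase()], [x])
print(func([test_input, 0.]))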

Yu-Yang
  • Thanks for the answer! It's clear to me that your approach is better suited, but could you just briefly comment on my original code... does that calculate something wrong? And why? – johk95 Aug 21 '17 at 20:37
  • Did you try this approach, and does it still give you a negative `x`? I tried your code and it gives exactly the same result as this approach (a *positive* `x`, around 600), so I'm not quite sure where the problem is in your code. – Yu-Yang Aug 22 '17 at 02:37
  • BTW, I saw `dense_0_out.shape` equals `(1, 612)` and `dense_1_weights.shape` equals `(612, 1)` from my program, which is different from what you posted. Can you provide the `test_input` you used and the version of Keras and TF? – Yu-Yang Aug 22 '17 at 03:10
  • Oops, I meant the TH version. I tried the same code on Theano and the results and shapes are still the same. Could you post more code (e.g. the model definition and fitting)? Maybe the error does not occur within the code block you've posted. – Yu-Yang Aug 22 '17 at 07:18
6

I can see a simple way, just changing the model structure a little. (See at the end how to reuse the existing model, changing only its ending.)

The advantages of this method are:

  • You don't have to guess if you're doing the right calculations
  • You don't need to care about the dropout layers and how to implement a dropout calculation
  • This is a pure Keras solution (it applies to any backend, Theano or TensorFlow).

There are two possible solutions below:

  • Option 1 - Create a new model from start with the proposed structure
  • Option 2 - Reuse an existing model changing only its ending

Model structure

You could just split the last Dense into two layers at the end:

[Dense]
    units = 612
    activation = softplus

[Dense]
    units = 1
    #no activation

[Activation]
    activation = sigmoid

Then you simply get the output of the last dense layer.

I'd say you should create two models, one for training, the other for checking this value.

Option 1 - Building the models from the beginning:

from keras.models import Model

#build the initial part of the model the same way you would
#add the Dense layer without an activation:

#if using the functional Model API
    denseOut = Dense(1)(outputFromThePreviousLayer)    
    sigmoidOut = Activation('sigmoid')(denseOut)    

#if using the sequential model - will need the functional API
    model.add(Dense(1))
    sigmoidOut = Activation('sigmoid')(model.output)

Create two models from that, one for training, one for checking the output of dense:

#if using the functional API
    checkingModel = Model(yourInputs, denseOut)

#if using the sequential model:
    checkingModel = model   

trainingModel = Model(checkingModel.inputs, sigmoidOut)   

Use trainingModel for training normally. The two models share weights, so training one is training the other.

Use checkingModel just to see the outputs of the Dense layer, using checkingModel.predict(X)
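
Putting Option 1 together, a minimal end-to-end sketch (the input shape, hidden size and compile settings are placeholders, not values from the question):

from keras.layers import Input, Dense, Activation
from keras.models import Model

inputs = Input(shape=(100,))                       # placeholder input size
hidden = Dense(612, activation='softplus')(inputs)
denseOut = Dense(1)(hidden)                        # no activation here
sigmoidOut = Activation('sigmoid')(denseOut)

checkingModel = Model(inputs, denseOut)            # returns the pre-sigmoid value
trainingModel = Model(inputs, sigmoidOut)

trainingModel.compile(optimizer='adam', loss='binary_crossentropy')
#trainingModel.fit(X, y, epochs=10)     # training updates both models (shared weights)
#preSigmoid = checkingModel.predict(X)  # value before the sigmoid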

Option 2 - Building this from an existing model:

from keras.models import Model

#find the softplus dense layer and get its output:
softplusOut = oldModel.layers[indexForSoftplusLayer].output
    #or should this be the output from the dropout? Whichever layer comes immediately before the final Dense(1)

#recreate the dense layer
outDense = Dense(1, name='newDense', ...)(softplusOut)

#create the new model
checkingModel = Model(oldModel.inputs,outDense)

It's important, since you created a new Dense layer, to get the weights from the old one:

wgts = oldModel.layers[indexForDense].get_weights()
checkingModel.get_layer('newDense').set_weights(wgts)

In this case, training the old model will not update the last dense layer in the new model, so, let's create a trainingModel:

outSigmoid = Activation('sigmoid')(checkingModel.output)
trainingModel = Model(checkingModel.inputs,outSigmoid)

Use checkingModel for checking the values you want with checkingModel.predict(X). And train the trainingModel.

Daniel Möller
2

This is for fellow Googlers: the workings of the Keras API have changed significantly since the accepted answer was posted. The working code for extracting a layer's output before activation (for the TensorFlow backend) is:

model = Your_Keras_Model()
the_tensor_you_need = model.output.op.inputs[0]  # <- this is indexable; if there are multiple inputs to this node, you can find the one you need by indexing.

In my case, the final layer was a dense layer with activation softmax, so the tensor output I needed was <tf.Tensor 'predictions/BiasAdd:0' shape=(?, 1000) dtype=float32>.
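
To actually evaluate that tensor on a batch, you can wrap it in a backend function. A sketch, assuming a graph-mode setup where model.output is a graph tensor (some_batch is a placeholder name):

from tensorflow.keras import backend as K

pre_activation = model.output.op.inputs[0]
get_pre_activation = K.function([model.input], [pre_activation])
#logits = get_pre_activation([some_batch])[0]  # values before the final softmax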

Manu S Pillai
1

(TF backend) Solution for Conv layers.

I had the same question, and rewriting the model's configuration was not an option. A simple hack is to perform the layer's call function manually, which gives you control over the activation.

Copy-paste from the Keras source, with self changed to layer. You can do the same with any other layer.

def conv_no_activation(layer, inputs, activation=False):

    if layer.rank == 1:
        outputs = K.conv1d(
            inputs,
            layer.kernel,
            strides=layer.strides[0],
            padding=layer.padding,
            data_format=layer.data_format,
            dilation_rate=layer.dilation_rate[0])
    elif layer.rank == 2:
        outputs = K.conv2d(
            inputs,
            layer.kernel,
            strides=layer.strides,
            padding=layer.padding,
            data_format=layer.data_format,
            dilation_rate=layer.dilation_rate)
    elif layer.rank == 3:
        outputs = K.conv3d(
            inputs,
            layer.kernel,
            strides=layer.strides,
            padding=layer.padding,
            data_format=layer.data_format,
            dilation_rate=layer.dilation_rate)

    if layer.use_bias:
        outputs = K.bias_add(
            outputs,
            layer.bias,
            data_format=layer.data_format)

    if activation and layer.activation is not None:
        outputs = layer.activation(outputs)

    return outputs

Now we need a small driver function. First, identify the layer by its name. Then retrieve the activations from the previous layer. Finally, compute the output of the target layer.

def get_output_activation_control(model, images, layername, activation=False):
    """Get activations for the input from specified layer"""

    inp = model.input

    layer_id, layer = [(n, l) for n, l in enumerate(model.layers) if l.name == layername][0]
    prev_layer = model.layers[layer_id - 1]
    conv_out = conv_no_activation(layer, prev_layer.output, activation=activation)
    functor = K.function([inp] + [K.learning_phase()], [conv_out]) 

    return functor([images, 0.])  # 0. = test-time value for the learning phase input

Here is a tiny test, using the VGG16 model.

a_relu = get_output_activation_control(vgg_model, img, 'block4_conv1', activation=True)[0]
a_no_relu = get_output_activation_control(vgg_model, img, 'block4_conv1', activation=False)[0]

print(np.sum(a_no_relu < 0))
> 245293

Set all negatives to zero to compare with the results retrieved after VGG16's built-in ReLU operation.

a_no_relu[a_no_relu < 0] = 0
print(np.allclose(a_relu, a_no_relu))
> True
Katerina
1

An easy way to define a new layer with a new activation function:

def change_layer_activation(layer):

    if isinstance(layer, keras.layers.Conv2D):

        config = layer.get_config()
        config["activation"] = "linear"
        new = keras.layers.Conv2D.from_config(config)

    elif isinstance(layer, keras.layers.Dense):

        config = layer.get_config()
        config["activation"] = "linear"
        new = keras.layers.Dense.from_config(config)

    # grab the trained weights so they can be copied onto the rebuilt layer
    weights = [x.numpy() for x in layer.weights]

    return new, weights
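
A hypothetical usage sketch (reconnecting the rebuilt layer is my assumption of how this helper is meant to be used, for a simple sequential-style topology; model stands for your trained model):

# swap the last layer's activation for 'linear' and copy the trained weights over
last_layer = model.layers[-1]
new_layer, weights = change_layer_activation(last_layer)

new_output = new_layer(model.layers[-2].output)   # reconnect to the previous layer
pre_activation_model = keras.models.Model(model.input, new_output)
new_layer.set_weights(weights)                    # the layer is built now, so the weights can be set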
Red One
0

I had the same problem, but none of the other answers worked for me. I'm using a newer version of Keras with TensorFlow, so some answers don't work now. Also, the structure of the model is given, so I can't change it easily. The general idea is to create a copy of the original model that works exactly like the original one, but with the activation split from the output layers. Once this is done, we can easily access the output values before the activation is applied.

First we will create a copy of the original model, but with no activation on the output layers. This is done using the Keras clone_model function (see the docs).

from tensorflow.keras.models import clone_model
from tensorflow.keras.layers import Activation

original_model = get_model()

def f(layer):
  config = layer.get_config()
  # drop the activation only from the model's output layers (not from standalone Activation layers)
  if not isinstance(layer, Activation) and layer.name in original_model.output_names:
    config.pop('activation', None)
  layer_copy = layer.__class__.from_config(config)
  return layer_copy

copy_model = clone_model(original_model, clone_function=f)

This alone only makes a clone with freshly initialized weights, so we must copy the original_model weights to the new one:

copy_model.build(original_model.input_shape)
copy_model.set_weights(original_model.get_weights())

Now we will add the activation layers:

from tensorflow.keras.models import Model

output_layers = [ original_model.get_layer(name=name) for name in copy_model.output_names ]
new_outputs = [ Activation(old_layer.activation)(output) if old_layer.activation else output
                for output, old_layer in zip(copy_model.outputs, output_layers) ]
copy_model = Model(copy_model.inputs, new_outputs)

Finally, we create a new model whose outputs are the values before the activation is applied:

no_activation_outputs = [ copy_model.get_layer(name=name).output for name in original_model.output_names ]
no_activation_model = Model(copy_model.inputs, no_activation_outputs)

Now we can use copy_model just like the original_model, and no_activation_model to access the pre-activation outputs. You could even modify the code to split off a custom set of layers instead of the output layers.
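
A short usage sketch (x_batch stands for whatever input the model expects):

predictions = copy_model.predict(x_batch)               # same results as original_model
pre_activations = no_activation_model.predict(x_batch)  # outputs before the final activation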