I want to apply linear activation to most of my Keras model's output layer, and sigmoid activation to a set of "columns" which are interleaved with the other data in the tensor.
Following this post on writing custom activations, @jdehesa's answer in this post on sliced assignment, and this other post about sliced assignment, I wrote the following:
from keras.layers import Activation
from keras import backend as K
from keras.utils.generic_utils import get_custom_objects
import tensorflow as tf
def selective_activation(x, start=0, end=None, skip_every=6):
    with tf.control_dependencies([x[:, start:end:skip_every].assign(K.sigmoid(x[:, start:end:skip_every]))]):
        x = tf.identity(x)
    return x
model = Sequential()
model.add(...bunch of layers...)
model.add(Dense(..., name="Final Layer"))
get_custom_objects().update({'selective_activation': Activation(selective_activation)})
model.add(Activation(selective_activation))
...
When I run this I get the error "ValueError: Sliced assignment is only supported for variables" on the line with the tf.control_dependencies context. I'm confused: how is my Keras layer's output NOT a Variable?
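A minimal check of the distinction the error is pointing at (a sketch, assuming TF 2.x eager mode; I take it the same distinction holds for graph-mode tensors):

```python
import tensorflow as tf

t = tf.zeros((2, 12))               # an ordinary tensor, like a layer's output
v = tf.Variable(tf.zeros((2, 12)))  # a Variable, which owns mutable storage

print(hasattr(t, "assign"))  # False: tensors are immutable values
print(hasattr(v, "assign"))  # True: only Variables support (sliced) assignment
```

So a layer's output is a plain (symbolic) tensor, not a Variable, and there is nothing for .assign to write into.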
Can someone suggest a way to implement the sort of assignment I'm trying to do?
I can only imagine three solutions:
- My currently-implemented workaround is to create two different output layers using the functional API, give each its own activation, concatenate them, and then multiply by a 'permutation matrix' (a matrix of 0's and 1's) to reorder the columns so that they end up where the rest of the code expects them (i.e. interleaved with the other linearly-activated variables). But this seems like an overly complex, verbose hack. (No need to submit an answer implementing this; I've already got it, but I don't like it.)
- Cook something up with tf.scatter_nd() or tf.scatter_update()...somehow?
- Rewrite everything else in the rest of the code to keep the 'existence' variables bunched together instead of interleaved with the other variables; that would be a lot of work I'm not eager to embark on.
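For the scatter-style idea, here is an assignment-free sketch of what I mean: instead of writing sigmoid values back into slices of x, blend two full-size tensors with a constant 0/1 column mask (a hedged sketch, TF 2.x; the mask construction, start=0, skip_every=6, and the width of 12 are illustrative assumptions):

```python
import numpy as np
import tensorflow as tf

def selective_activation(x, start=0, skip_every=6):
    # Build a constant 0/1 mask over the last axis: 1 on the columns
    # that should get sigmoid, 0 on the columns left linear.
    width = int(x.shape[-1])  # requires a statically known output width
    mask_np = np.zeros((width,), dtype="float32")
    mask_np[start::skip_every] = 1.0
    mask = tf.constant(mask_np)
    # sigmoid where mask == 1, identity elsewhere -- no in-place assignment
    return mask * tf.sigmoid(x) + (1.0 - mask) * x

y = selective_activation(tf.zeros((1, 12)))
# columns 0 and 6 become sigmoid(0) = 0.5; all other columns stay 0.0
```

Wrapped in a Lambda or Activation layer, this would keep every column in place, so no permutation matrix or column reshuffling would be needed.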
(This is for an object detector, by the way, which previously used MSE loss for all variables; now I want cross-entropy loss for the 'does an object exist' category.)