
What I'm trying to do

I'm currently building a very simple sequence-to-sequence LSTM in Keras, with a minor twist: earlier predictions in the sequence should count less toward the loss than later ones. The way I'm trying to do this is by counting the sequence number and multiplying the loss by the square root of that count. (I want to do this because this value is representative of the relative ratio of uncertainty in a Poisson process, based on the number of samples collected. My network is gathering data and attempting to estimate an invariant value based on the data gathered so far.)
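Concretely, for a sequence of length T I want the loss at step i scaled by sqrt(i). A minimal NumPy illustration of the intended weights (illustration only, not the Keras code):

import numpy as np

# Intended per-step loss weights for a sequence of length T = 4:
# step i contributes sqrt(i) times its raw loss.
weights = np.sqrt(np.arange(1, 5))
print(weights)  # [1.         1.41421356 1.73205081 2.        ]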

How I'm trying to do it

I've implemented both a custom loss function and a custom layer.

Loss function:

# assuming the Keras backend versions of these ops:
from keras.backend import sqrt, dot
from keras.losses import mean_squared_error

def loss_function(y_true, y_pred):
    # extract_output essentially concatenates the first three regression outputs of y
    # into a list representing an [x, y, z] vector, and returns it along with the rest as a tuple
    r, e, n = extract_output(y_true)
    r_h, e_h, n_h = extract_output(y_pred)

    # Hyperparameters
    dir_loss_weight = 10
    dist_loss_weight = 1
    energy_loss_weight = 3

    norm_r = sqrt(dot(r, r))
    norm_r_h = sqrt(dot(r_h, r_h))

    dir_loss = mean_squared_error(r/norm_r, r_h/norm_r_h)
    dist_loss = mean_squared_error(norm_r, norm_r_h)
    energy_loss = mean_squared_error(e, e_h)

    return sqrt(n) * (dir_loss_weight * dir_loss + dist_loss_weight * dist_loss + energy_loss_weight * energy_loss)
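For reference, `extract_output` is roughly the following (a hypothetical sketch; the exact slicing depends on how the outputs are packed):

def extract_output(y):
    # hypothetical sketch of the helper used above:
    # first three columns form the [x, y, z] regression vector,
    # followed by the energy and the sequence count
    r = y[..., 0:3]  # position vector
    e = y[..., 3]    # energy
    n = y[..., 4]    # sequence count (appended by CounterLayer)
    return r, e, n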

Custom Layer:

class CounterLayer(Layer):
    def __init__(self, **kwargs):
        super(CounterLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.sequence_number = 0

    def call(self, x):
        # intended to increment once per element of the sequence
        self.sequence_number += 1
        return [self.sequence_number]

    def compute_output_shape(self, input_shape):
        return (1,)

I then concatenated the counter's output with the regular output:

seq_num = CounterLayer()(inputs)
outputs = concatenate([out, seq_num])

What's going wrong

My error is:

Traceback (most recent call last):
  File "lstm.py", line 119, in <module>
    main()
  File "lstm.py", line 115, in main
    model = create_model()
  File "lstm.py", line 74, in create_model
    seq_num = CounterLayer()(inputs)
  File "/usr/lib/python3.7/site-packages/keras/engine/base_layer.py", line 497, in __call__
    arguments=user_kwargs)
  File "/usr/lib/python3.7/site-packages/keras/engine/base_layer.py", line 565, in _add_inbound_node
    output_tensors[i]._keras_shape = output_shapes[i]
AttributeError: 'int' object has no attribute '_keras_shape'

I'm assuming I have the shape wrong, but I don't know how. Am I going about this the wrong way? What should I do to make this work?

Further Adventures

Per @Mohammad Jafar Mashhadi's comment, my `call` return needed to be wrapped in a `keras.backend.variable`; however, per his linked answer, my approach will not work, because `call` is not invoked multiple times, as I initially assumed it was.
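For reference, the variable-based version that fixed the compile error looks roughly like this (a sketch; it compiles, but `call` still runs only once, at graph-build time):

from keras import backend as K
from keras.layers import Layer

class CounterLayer(Layer):
    def build(self, input_shape):
        # a backend variable instead of a plain Python int
        self.sequence_number = K.variable(0.0, name="sequence_number")
        super(CounterLayer, self).build(input_shape)

    def call(self, x):
        # executes once, while the graph is built, so this adds 1 exactly once
        return K.update_add(self.sequence_number, 1.0)

    def compute_output_shape(self, input_shape):
        return (1,)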

How can I get a counter for the RNN?

For clarity: if the RNN, given input `xi`, outputs `yi`, I'm trying to get `i` as part of my output.

x1 -> RNN -> (y1, 1)
    h1 |
       v
x2 -> RNN -> (y2, 2)
    h2 |
       v
x3 -> RNN -> (y3, 3)
    h3 |
       v
x4 -> RNN -> (y4, 4)
OmnipotentEntity
  • I don't know the answer to the whole question, but I think there is a bug in your custom layer definition. You need to define the sequence number as a TensorFlow variable; a normal Python variable won't work. Check out my answer here: https://stackoverflow.com/questions/60589400/how-to-create-a-custom-layer-in-keras-with-stateful-variables-tensors/60591238#60591238 – Mohammad Jafar Mashhadi Mar 22 '20 at 16:45
  • Thanks, changing it into a TensorFlow variable fixed the compile error. But from your answer, my approach will not work, because `call` is not called multiple times, as I was assuming. Is there a way to get a sequence counter, do you know? – OmnipotentEntity Mar 22 '20 at 18:11

2 Answers


The error is saying that `inputs`, in the line `seq_num = CounterLayer()(inputs)`, is an integer.

You can't pass integers as inputs to layers. You must pass Keras tensors, and only Keras tensors.


Second, this will not work because Keras builds a static graph. A layer's `call` doesn't calculate anything; it only builds the graph out of empty tensors. Only tensors ever get updated as you pass data through them; integer values do not. When you write `self.sequence_number += 1`, that line runs only once, while the model is being built, not over and over.
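You can verify this yourself: a `print` inside `call` fires once, while the model is built, never during training or prediction. A small sketch:

from keras.layers import Layer, Input
from keras.models import Model

class PrintingLayer(Layer):
    def call(self, x):
        # runs while the graph is being built, not while data flows through it
        print("call() executed")
        return x

inp = Input(shape=(10, 3))
out = PrintingLayer()(inp)  # prints "call() executed" exactly once
model = Model(inp, out)
# model.predict(...) will not print again; only the compiled graph runs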


We need details

We can't really understand what is going on if you don't give us enough information, such as:

  • the model
  • the summary
  • your input data shapes
  • the target data shapes
  • the custom functions
  • etc.

If the interpretation below is correct, the model's output shape in the summary and the target data shapes as you pass them to `fit` are absolutely important to know.


Proposed solution

If I understood what you described, you want to have a sequence of increasing integers along with the time steps of your sequences, so these numbers are used in your loss function.

If this interpretation is right, you don't need to keep updating numbers, you just need to create a range tensor and that's it.

So, inside your loss (which I can't follow unless you provide the custom functions) you should create this tensor:

import tensorflow as tf
from keras import backend as K

def custom_loss(y_true, y_pred):
    # use this if you defined your model with a static sequence length - input_shape = (length, features)
    length = K.int_shape(y_pred)[1]

    # use this if you defined your model with a dynamic sequence length - input_shape = (None, features)
    length = K.shape(y_pred)[1]

    # this is the sequence vector:
    seq = tf.range(1, length + 1)

    # you can get the root with (cast to float first, since tf.range yields integers):
    seq = K.sqrt(K.cast(seq, K.floatx()))

    # you reshape to match the shape of the loss, which is probably (batch, length)
    seq = K.reshape(seq, (1, -1))  # shape (1, length)

    # compute your loss normally, taking care to reduce the last axis and keep the first two
    loss = .....  # shape (batch, length)

    # multiply the weights by the loss
    return loss * seq

You must treat everything as whole tensors! You cannot work step by step; everything in your loss must keep the first and second dimensions intact.
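As a concrete example, here is the whole thing with a plain per-step MSE as the base loss (a sketch; substitute your own loss where the MSE appears):

import tensorflow as tf
from keras import backend as K

def custom_loss(y_true, y_pred):
    # dynamic sequence length: input_shape = (None, features)
    length = K.shape(y_pred)[1]

    # sqrt(1) ... sqrt(length), shaped (1, length) to broadcast over the batch
    seq = K.sqrt(K.cast(tf.range(1, length + 1), K.floatx()))
    seq = K.reshape(seq, (1, -1))

    # plain per-step MSE, reduced over the feature axis -> shape (batch, length)
    loss = K.mean(K.square(y_true - y_pred), axis=-1)

    # weight each time step by the square root of its index
    return loss * seq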

Daniel Möller
  • I'm sorry for being frustratingly vague; it wasn't intentional. This pointed me in the right direction, and while I'm still working on it, I'm confident I know enough to work out the correct method now. I will award the bounty when it becomes available. – OmnipotentEntity Mar 23 '20 at 23:13

I'm not sure I've understood the question completely, but based on the final diagram, I think that to get an extra feature (the time step) fed into the loss function along with the predictions, you might try the second approach suggested in this other accepted answer: Custom loss function in Keras based on the input data

The idea is to expand the label vector with the extra feature, and then separate them again inside the loss function.

from keras import backend as K

# y_true_plus_timesteps has shape [n_training_instances, 2]
def custom_loss(y_true_plus_timesteps, y_pred):
    # labels stored in the first column
    y_true = y_true_plus_timesteps[:, 0]

    # time steps stored in the second column
    time_steps = y_true_plus_timesteps[:, 1]

    # base MSE; combine it with a time-step weighting of your choice
    return K.mean(K.square(y_pred - y_true), axis=-1)  # + your loss function

# note that labels are fed into the model together with the time steps
model.fit(X, np.append(Y_true, time_steps, axis=1), batch_size=batch_size, epochs=90, shuffle=True, verbose=1)
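For instance, the extra column could be built like this (a sketch assuming `Y_true` has shape `(n_training_instances, 1)` and that each instance carries its own step index):

import numpy as np

# hypothetical step index for each training instance, as a column vector
time_steps = np.arange(1, len(Y_true) + 1).reshape(-1, 1)

# shape (n_training_instances, 2): column 0 = label, column 1 = time step
Y_with_steps = np.append(Y_true, time_steps, axis=1)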
Edoardo Guerriero