
I am trying to run a wavelet reconstruction dataset through a sequential Keras neural network. To get better training results, I am trying to construct a custom loss function that focuses only on certain indices of the waveform. My goal is a network that interpolates clipped waveforms, so I want the loss to be computed only by comparing the clipped segments of the waveform to the actual output.

I have already tried creating a wrapper for my custom loss function so that I can pass in an additional inputs parameter. I then use this parameter to find the indices of the clipped data points and attempt to gather those indices from y_pred and y_true.
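To make the goal concrete, here is a plain-numpy illustration of the loss I am after (assuming the clipped samples are the ones sitting at the input's maximum value):

import numpy as np

# toy example: only the clipped samples contribute to the loss
x = np.array([0.2, 1.0, 1.0, 0.3])       # input waveform, clipped at 1.0
y_true = np.array([0.2, 1.4, 1.2, 0.3])  # actual (unclipped) waveform
y_pred = np.array([0.2, 1.3, 1.1, 0.3])  # network output

mask = x == x.max()                                  # clipped indices
loss = np.mean((y_true[mask] - y_pred[mask]) ** 2)   # MSE over those only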

This is where the model is instantiated and trained:

x_train, x_test, y_train, y_test = train_test_split(X, Y, train_size=0.7)
_dim = len(x_train[0])

# define the keras model
model = Sequential()

# tanh activation allows for vals between -1 and 1 unlike relu
model.add(Dense(_dim*2, input_dim=_dim, activation=_activation))
model.add(Dense(_dim*2, activation=_activation))
model.add(Dense(_dim, activation=_activation))
# model.compile(loss=_loss, optimizer=_optimizer)
model.compile(loss=_loss, optimizer=_optimizer, metrics=[custom_loss_wrapper_2(x_train)])

print(model.summary())

# The patience parameter is the number of epochs to check for improvement
early_stop = EarlyStopping(monitor='val_loss', patience=5)

# fit the model
history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=150, batch_size=15, callbacks=[early_stop])

And this is my custom loss function:

def custom_loss_wrapper_2(inputs):
    # source: https://stackoverflow.com/questions/55445712/custom-loss-function-in-keras-based-on-the-input-data
    # 2nd source: https://stackoverflow.com/questions/55597335/how-to-use-tf-gather-in-batch
    def reindex(tensor_tuple):
        # unpack tensor tuple
        y_true = tensor_tuple[0]
        y_pred = tensor_tuple[1]
        t_inputs = K.cast(tensor_tuple[2], dtype='int64')
        t_max_indices = K.tf.where(K.tf.equal(t_inputs, K.max(t_inputs)))

        # gather the values from y_true and y_pred
        y_true_gathered = K.gather(y_true, t_max_indices)
        y_pred_gathered = K.gather(y_pred, t_max_indices)

        print(K.mean(K.square(y_true_gathered - y_pred_gathered)))

        return K.mean(K.square(y_true_gathered - y_pred_gathered))

    def custom_loss(y_true, y_pred):
        # Step 1: "tensorize" the previous list
        t_inputs = K.variable(inputs)

        # Step 2: Stack tensors
        tensor_tuple = K.stack([y_true, y_pred, t_inputs], axis=1)

        vals = K.map_fn(reindex, tensor_tuple, dtype='float32')
        print('vals: ', vals)

        return K.mean(vals)

    return custom_loss

I am getting the following error message for one of my attempts at a custom loss function:

Using TensorFlow backend.
WARNING: Logging before flag parsing goes to stderr.
W0722 15:28:20.239395 17232 deprecation_wrapper.py:119] From C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\keras\backend\tensorflow_backend.py:74: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

W0722 15:28:20.252325 17232 deprecation_wrapper.py:119] From C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\keras\backend\tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

W0722 15:28:20.253353 17232 deprecation_wrapper.py:119] From C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\keras\backend\tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

W0722 15:28:20.280281 17232 deprecation_wrapper.py:119] From C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\keras\optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

W0722 15:28:20.293246 17232 deprecation_wrapper.py:119] From C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\keras\backend\tensorflow_backend.py:1521: The name tf.log is deprecated. Please use tf.math.log instead.

W0722 15:28:20.366046 17232 deprecation.py:323] From C:\Users\Madison\PycharmProjects\MSTS\Seismic_Analysis\ML\custom_loss.py:83: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Tensor("metrics/custom_loss/map/while/Mean:0", shape=(), dtype=float32)
vals:  Tensor("metrics/custom_loss/map/TensorArrayStack/TensorArrayGatherV3:0", shape=(1228,), dtype=float32)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 1002)              503004    
_________________________________________________________________
dense_2 (Dense)              (None, 1002)              1005006   
_________________________________________________________________
dense_3 (Dense)              (None, 501)               502503    
=================================================================
Total params: 2,010,513
Trainable params: 2,010,513
Non-trainable params: 0
_________________________________________________________________
None
W0722 15:28:20.467779 17232 deprecation_wrapper.py:119] From C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\keras\backend\tensorflow_backend.py:986: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

Train on 1228 samples, validate on 527 samples
Epoch 1/150
2019-07-22 15:28:20.606792: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
Traceback (most recent call last):
  File "C:/Users/Madison/PycharmProjects/MSTS/Seismic_Analysis/ML/clipping_ml.py", line 172, in <module>
    main()
  File "C:/Users/Madison/PycharmProjects/MSTS/Seismic_Analysis/ML/clipping_ml.py", line 168, in main
    run_general()
  File "C:/Users/Madison/PycharmProjects/MSTS/Seismic_Analysis/ML/clipping_ml.py", line 156, in run_general
    _loss=_loss, _activation=_activation, _optimizer=_optimizer)
  File "C:/Users/Madison/PycharmProjects/MSTS/Seismic_Analysis/ML/clipping_ml.py", line 59, in build_clipping_model
    history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=150, batch_size=15, callbacks=[early_stop])
  File "C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\keras\engine\training.py", line 1039, in fit
    validation_steps=validation_steps)
  File "C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\keras\engine\training_arrays.py", line 199, in fit_loop
    outs = f(ins_batch)
  File "C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "C:\Users\Madison\PycharmProjects\MSTS\venv\lib\site-packages\tensorflow\python\client\session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shapes of all inputs must match: values[0].shape = [15,501] != values[2].shape = [1228,501]
     [[{{node metrics/custom_loss/stack}}]]

2 Answers


After a little more thought I have found the answer to my original question. I figured I would post it here in case it helps someone in the future. The issue I was having had to do with the input parameter I was providing to my loss function wrapper. I was passing in the entire array of inputs when I should have been passing in only the current batch's inputs. That is done by handing the wrapper model.input at compile time, so the new compile line should look like:

model.compile(loss=_loss, optimizer=_optimizer, metrics=[custom_loss_wrapper_2(model.input)])
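With model.input, the wrapper receives a symbolic tensor that is already sized to the current batch, so the K.variable call can be dropped. A rough sketch of the adjusted wrapper (reindex is unchanged from the question, and this exact body is my reconstruction rather than tested code):

def custom_loss_wrapper_2(inputs):
    # inputs is now the symbolic batch tensor (model.input), shaped
    # (batch_size, dim) just like y_true and y_pred
    def custom_loss(y_true, y_pred):
        # all three tensors share the batch dimension, so the stack no
        # longer mixes [batch_size, dim] with [n_samples, dim]
        tensor_tuple = K.stack([y_true, y_pred, inputs], axis=1)

        vals = K.map_fn(reindex, tensor_tuple, dtype='float32')
        return K.mean(vals)

    return custom_loss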

Can you share a runnable but failing example of the problem? Even with just a few data points. Right now it looks like your data is inconsistently shaped, e.g. one wavelet is longer than another. The batch needs to be homogeneous. A way to check this would be:

print(set(inp.shape for inp in inputs))

If that set has more than one element, you might have to augment your data.
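For example, you could zero-pad every waveform up to the longest one before training (a hypothetical helper, since I don't know how your data is stored):

import numpy as np

def pad_to_common_length(waveforms):
    # zero-pad each 1-D waveform so every sample has the same length
    max_len = max(len(w) for w in waveforms)
    return np.array([np.pad(w, (0, max_len - len(w)), mode='constant')
                     for w in waveforms])

X = pad_to_common_length(raw_waveforms)  # raw_waveforms: your list of 1-D arrays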

Sample code built from the snippets in the question:

import numpy as np
from keras import backend as K
from keras.callbacks import EarlyStopping
from keras.layers import Dense, Activation
from keras.models import Sequential
from keras import optimizers
from sklearn.model_selection import train_test_split

_activation = Activation('softmax')
_optimizer = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)

def custom_loss_wrapper_2(inputs):
    print("inputs {}".format(inputs.shape))
    # source: https://stackoverflow.com/questions/55445712/custom-loss-function-in-keras-based-on-the-input-data
    # 2nd source: https://stackoverflow.com/questions/55597335/how-to-use-tf-gather-in-batch
    def reindex(tensor_tuple):
        # unpack tensor tuple
        y_true = tensor_tuple[0]
        y_pred = tensor_tuple[1]
        t_inputs = K.cast(tensor_tuple[2], dtype='int64')
        t_max_indices = K.tf.where(K.tf.equal(t_inputs, K.max(t_inputs)))

        # gather the values from y_true and y_pred
        print("y_true {}".format(y_true.shape))
        print("y_pred {}".format(y_pred.shape))
        y_true_gathered = K.gather(y_true, t_max_indices)
        y_pred_gathered = K.gather(y_pred, t_max_indices)

        print(K.mean(K.square(y_true_gathered - y_pred_gathered)))

        return K.mean(K.square(y_true_gathered - y_pred_gathered))

    def custom_loss(y_true, y_pred):
        print("y_true2 {}".format(y_true.shape))
        print("y_pred2 {}".format(y_pred.shape))

        # Step 1: "tensorize" the previous list
        t_inputs = K.variable(inputs)

        # Step 2: Stack tensors
        tensor_tuple = K.stack([y_true, y_pred, t_inputs], axis=1)

        vals = K.map_fn(reindex, tensor_tuple, dtype='float32')
        print('vals: {}'.format(vals.shape))
        print('kvals: {}'.format(K.mean(vals).shape))
        return K.mean(vals, keepdims=True)

    return custom_loss

dataset_size = 100
dim = 501
X = np.random.rand(dataset_size, dim)
Y = np.random.rand(dataset_size, dim)

x_train, x_test, y_train, y_test = train_test_split(X, Y, train_size=0.7)
print(x_train.shape)
print(y_train.shape)

print(x_test.shape)
print(y_test.shape)

_dim = len(x_train[0])
print("_dim {}".format(_dim))
# define the keras model
model = Sequential()

_loss = custom_loss_wrapper_2(x_train)
_mmm = _loss

# tanh activation allows for vals between -1 and 1 unlike relu
model.add(Dense(_dim*2, input_shape=(_dim,), activation=_activation))
model.add(Dense(_dim*2, activation=_activation))
model.add(Dense(_dim, activation=_activation))
# model.compile(loss=_loss, optimizer=_optimizer)
model.compile(loss=_loss, optimizer=_optimizer, metrics=[_mmm])

print(model.summary())

# The patience parameter is the number of epochs to check for improvement
early_stop = EarlyStopping(monitor='val_loss', patience=5)

# fit the model
history = model.fit(
    x_train,
    y_train,
    validation_data=(x_test, y_test),
    epochs=150,
    batch_size=10,
    callbacks=[early_stop])


  • I ran that line of code and the resulting value printed for my dataset is {(501,)}, so it looks like all data points at least have the same length. I also ran my code using randomly generated numpy arrays and am coming up with the same problem. I used the following to generate my X and Y rather than loading those arrays from mat files as I was doing previously: X = np.random.rand(44, 501), Y = np.random.rand(44, 501). I get the same error, with only the shape parameters changing: values[0].shape = [15,501] != values[2].shape = [30,501]. – Madison Sheridan Jul 23 '19 at 14:02
  • I ran out of time to look at this, but the problem is that somehow the batch size is getting into the graph computation. If you remove the validation data, it seems like there is no problem, though I suspect there still is. I'm pasting my shape-debugging code into the answer. – ubershmekel Jul 24 '19 at 18:46