39

I want to implement my custom metric in Keras. According to the documentation, my custom metric should be defined as a function that takes as input two tensors, y_pred and y_true, and returns a single tensor value.

However, I'm confused about what exactly these tensors y_pred and y_true will contain when the optimization is running. Is it just one data point? Is it the whole batch? The whole epoch (probably not)? Is there a way to obtain these tensors' shapes?

Can someone point me to a trustworthy place where I can find this information? Any help would be appreciated. Not sure if it's relevant, but I'm using the TensorFlow backend.


Things I've tried so far in order to answer this:

  • Checking the Keras metrics documentation (there is no explanation there of what these tensors are).
  • Checking the Keras metrics source code and trying to understand these tensors by looking at the Keras implementation of other metrics (this seems to suggest that y_true and y_pred hold the labels for an entire batch, but I'm not sure).
  • Reading these Stack Overflow questions: 1, 2, 3, and others (none answer my question; most center on the OP not clearly understanding the difference between a tensor and the values computed from that tensor during the session).
  • Printing the values of y_true and y_pred during the optimization, by defining a metric like this:
    from keras import backend as K

    def test_metric(y_true, y_pred):
        y_true = K.print_tensor(y_true)
        y_pred = K.print_tensor(y_pred)
        return y_true - y_pred

(Unfortunately, these don't print anything during the optimization.)

charlesreid1
JLagana

3 Answers

43

y_true and y_pred

The tensor y_true is the true data (or target, ground truth) you pass to the fit method.
It's a conversion of the numpy array y_train into a tensor.

The tensor y_pred is the data predicted (calculated, output) by your model.

Usually, y_true and y_pred have exactly the same shape. A few losses, such as the sparse ones, may accept them with different shapes.
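To illustrate the "sparse" exception, here is a numpy sketch of the convention (an illustration, not Keras code): with the sparse categorical losses and metrics, y_true holds integer class indices of shape (batch,) while y_pred holds one probability per class, shape (batch, n_classes).

```python
import numpy as np

y_true = np.array([0, 2, 1, 2])              # shape (4,)   - integer class indices
y_pred = np.array([[0.8, 0.1, 0.1],          # shape (4, 3) - class probabilities
                   [0.2, 0.2, 0.6],
                   [0.1, 0.7, 0.2],
                   [0.3, 0.3, 0.4]])

# Sparse categorical accuracy: compare the index of the highest
# probability with the integer label, then average over the batch.
acc = np.mean(np.argmax(y_pred, axis=-1) == y_true)
print(acc)  # 1.0 - every prediction matches its label
```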


The shape of y_true

It contains an entire batch. Its first dimension is always the batch size, and it must exist, even if the batch has only one element.

Two very easy ways to find the shape of y_true are:

  • check your true/target data: print(Y_train.shape)
  • check your model.summary() and see the last output

Either way, the None in the first dimension will be replaced by the batch size at run time.

So, if your last layer outputs (None, 1), the shape of y_true is (batch, 1). If the last layer outputs (None, 200, 200, 3), then y_true will be (batch, 200, 200, 3).


Custom metrics and loss functions

Unfortunately, printing custom metrics will not reveal their content (unless you are running in eager mode and have evaluated every step of the model with data).
You can, however, see their shapes with print(K.int_shape(y_pred)), for instance.

Remember that these libraries first "compile a graph" and only later "run it with data". When you define your loss, you're in the compile phase, and asking for data requires the model to run.

But even if the result of your metric is multidimensional, Keras will automatically reduce it to a single scalar. (I'm not sure what the operation is, but it is very probably a K.mean() hidden under the table - it's useful to return the entire batch so that Keras can apply other operations, such as sample weights.)
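As a numpy sketch of that reduction (assuming the hidden operation is a mean, as hedged above): a metric that returns one value per sample is collapsed to a scalar, and sample weights turn that plain mean into a weighted mean, which is exactly why returning the whole batch is useful.

```python
import numpy as np

# Per-sample metric values for a batch of 4 (e.g. per-sample absolute error)
per_sample = np.array([1.0, 2.0, 3.0, 4.0])

# Without sample weights: plain mean over the batch
scalar = per_sample.mean()
print(scalar)  # 2.5

# With sample weights: weighted mean - possible only because the metric
# returned the whole batch instead of a pre-reduced scalar
weights = np.array([1.0, 1.0, 0.0, 0.0])
weighted = np.sum(per_sample * weights) / np.sum(weights)
print(weighted)  # 1.5
```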

Sources: after you get used to Keras, this understanding becomes natural from simply reading this part of the documentation:

y_true: True labels. Theano/TensorFlow tensor.
y_pred: Predictions. Theano/TensorFlow tensor of the same shape as y_true.

"True labels" means the true/target data. "Labels" is a badly chosen word here; it is only really "labels" in classification models.
"Predictions" means the results of your model.

Daniel Möller
  • I have a question regarding y_true. My training data (numpy array) has shape (100,). However, inside a metric, e.g. accuracy it has shape (TensorShape([Dimension(None), Dimension(None)]). Then, in the keras accuracy metric they compute K.max(y_true, axis=-1). What is the second dimension? Why do they take the argmax over this dimension instead of the first one? – Lemon Nov 02 '17 at 08:48
  • If `yTrain` is (100,), it probably changed it to (100, 1). This accuracy metric assumes you are using one-hot classes. – Daniel Möller Nov 02 '17 at 13:31
  • Okay. So when not using one hot classes I would have to change the accuracy computation to `K.max(y_true, axis=0)`? – Lemon Nov 03 '17 at 10:20
  • We need to understand what your data is to answer that. Is it a binary (0 or 1) result? If so, you can use `binary_crossentropy` as the loss function, and Keras will automatically use a suitable accuracy for that, based on `K.round(y_pred)`. - https://github.com/fchollet/keras/blob/master/keras/metrics.py – Daniel Möller Nov 06 '17 at 11:25
  • Can anyone tell us how to implement custom metrics that calculate mean(y_pred - y_true)? I just want average value of the difference between predicted value and true value – Katelynn ruan Mar 07 '19 at 02:46
  • Use `metrics=['mae']` (mean absolute error), or use `def metr(true, pred): return K.mean(pred-true)` with `metrics=[metr]` – Daniel Möller Mar 07 '19 at 02:49
  • @DanielMöller I want to be sure if `y_pred` has the same shape as `y_true` from the last section of your comment. E.g. if my model output `y_pred` has shape `[None, seq_length, feature_size]` then `y_true` is also a 3-D tensor (verified) though I pass only 2-D tensor in `fit` method. So the last comment should be read as `y_true` has shape same as `y_pred`. – CKM Jul 27 '19 at 11:10
  • "Both y_true and y_pred have exactly the same shape, always." Across all dimensions? E.g., in this question (https://stackoverflow.com/q/58386664/829332), y_pred is (None, 6) but (I assumed, perhaps wrongly) y_true is (None, 1). – Dan Oct 15 '19 at 20:53
  • Well, there are losses that accept a different y_true, especially the "sparse" types. The usual is an exact shape match, though. – Daniel Möller Oct 15 '19 at 20:55
  • @DanielMöller. Q1. I am passing `y_true` as 2 D array and `y_pred` is a 3 D array. But, within custom loss, `y_true` becomes 3 D rather than `y_pred` becoming 2 D which is contrary to your last comment. (`y_pred`'s shape should change to `y_true`s shape). Why is this? The point is should `y_true` shape change to `y_pred` shape or vice-versa to match the shape? – CKM Dec 08 '19 at 13:22
  • If it's a custom loss, you should control it the way you want. – Daniel Möller Dec 08 '19 at 13:33
0

y_true is the true value (the labels), and y_pred contains the values your NN model predicted.

The size (shape) of these tensors is determined by the size of the batches.

Vadim
  • Could you elaborate your answer a bit more? Let's say that the output of my classifier network is N-dimensional (i.e. a pmf for N classes), and my batch size is B. Then the shape of, e.g., `y_true` would be (N,B) or (B,N)? Or something else? – JLagana Oct 10 '17 at 11:37
  • Also, can you point to any references that support your statement? – JLagana Oct 10 '17 at 11:39
-1

y_true contains the target values and y_pred contains the values predicted by the model. The parameter position in the function is also important. You can check this by implementing the metric with only one example and observing what the function reports when used as a metric. Note: while checking this property, avoid using a validation split (there are not enough examples for a split to happen) and avoid scaling the examples, for better visualization.
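A minimal numpy sketch of that order check (a simulation, not Keras itself): a metric that returns only its first argument's mean makes the parameter order observable, because the reported value equals the label mean rather than the prediction mean.

```python
import numpy as np

def which_argument(y_true, y_pred):
    # Returns only the first argument's mean; if the reported metric
    # equals the mean of the labels, the first parameter is y_true.
    return np.mean(y_true)

y_true = np.array([0.0, 0.0, 0.0, 0.0])   # labels, mean 0.0
y_pred = np.array([1.0, 1.0, 1.0, 1.0])   # predictions, mean 1.0

print(which_argument(y_true, y_pred))  # 0.0 -> first parameter is y_true
```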