
I'm pretty new to Keras and I'm trying to define my own metric. It calculates the concordance index, which is a measure for regression problems.

def cindex_score(y_true, y_pred):
    total = 0
    pair = 0
    for i in range(1, len(y_true)):
        for j in range(0, i):
            if y_true[i] > y_true[j]:
                pair += 1
                total += 1 * (y_pred[i] > y_pred[j]) + 0.5 * (y_pred[i] == y_pred[j])
    if pair != 0:
        return total / pair
    else:
        return 0


def baseline_model(hidden_neurons, inputdim):
    model = Sequential()
    model.add(Dense(hidden_neurons, input_dim=inputdim, init='normal', activation='relu'))
    model.add(Dense(hidden_neurons, init='normal', activation='relu'))
    model.add(Dense(1, init='normal')) #output layer

    model.compile(loss='mean_squared_error', optimizer='adam', metrics=[cindex_score])
    return model

def run_model(P_train, Y_train, P_test, model):
    history = model.fit(numpy.array(P_train), numpy.array(Y_train), batch_size=50, nb_epoch=200)
    plotLoss(history)
    return model.predict(P_test)

The baseline_model, run_model and cindex_score functions are in one.py, and the following function, where I call the model, is in two.py:

def experiment():
    hidden_neurons = 250
    dmodel=baseline_model(hidden_neurons, train_pair.shape[1])
    predicted_Y = run_model(train_pair,train_Y, test_pair, dmodel)

But I get the following error: "object of type 'Tensor' has no len()". It does not work with the shape attribute either.

For instance, y_true is represented as Tensor("dense_4_target:0", shape=(?, ?), dtype=float32) and its shape is Tensor("strided_slice:0", shape=(), dtype=int32).

Could you please help me understand how to iterate over a Tensor object?

Best,

patti_jane

3 Answers


If you are comfortable using tensorflow, then you can try using this code instead:

import tensorflow as tf

def cindex_score(y_true, y_pred):

    # pairwise prediction differences: count 1 for a correctly ordered pair, 0.5 for a tie
    g = tf.subtract(tf.expand_dims(y_pred, -1), y_pred)
    g = tf.cast(tf.equal(g, 0.0), tf.float32) * 0.5 + tf.cast(tf.greater(g, 0.0), tf.float32)

    # indicator of pairs where y_true[i] > y_true[j], kept only in the lower triangle
    f = tf.cast(tf.greater(tf.subtract(tf.expand_dims(y_true, -1), y_true), 0.0), tf.float32)
    f = tf.matrix_band_part(f, -1, 0)

    g = tf.reduce_sum(tf.multiply(g, f))
    f = tf.reduce_sum(f)

    return tf.where(tf.equal(g, 0), 0.0, g / f)

Here is some code that verifies that both approaches are equivalent:

def _ref(J, K):
    # loop-based version from the question, used as the reference
    _sum = 0
    _pair = 0
    for _i in range(1, len(J)):
        for _j in range(0, _i):
            if J[_i] > J[_j]:
                _pair += 1
                _sum += 1 * (K[_i] > K[_j]) + 0.5 * (K[_i] == K[_j])
    return 0 if _pair == 0 else _sum / _pair

def _raw(J, K):
    # vectorized version, identical to cindex_score above

    g = tf.subtract(tf.expand_dims(K, -1), K)
    g = tf.cast(tf.equal(g, 0.0), tf.float32) * 0.5 + tf.cast(tf.greater(g, 0.0), tf.float32)

    f = tf.cast(tf.greater(tf.subtract(tf.expand_dims(J, -1), J), 0.0), tf.float32)
    f = tf.matrix_band_part(f, -1, 0)

    g = tf.reduce_sum(tf.multiply(g, f))
    f = tf.reduce_sum(f)

    return tf.where(tf.equal(g, 0), 0.0, g / f)


import numpy as np

# sanity check: both implementations should agree on random 1D inputs
for _ in range(100):
    with tf.Session() as sess:
        inputs = [tf.placeholder(dtype=tf.float32),
                  tf.placeholder(dtype=tf.float32)]
        D = np.random.randint(low=10, high=1000)
        data = [np.random.rand(D), np.random.rand(D)]

        r1 = sess.run(_raw(inputs[0], inputs[1]),
                      feed_dict={x: y for x, y in zip(inputs, data)})
        r2 = _ref(data[0], data[1])

        assert np.isclose(r1, r2)

Please note that this only works for 1D tensors, which is rarely what you get inside Keras, where y_true and y_pred usually have shape (batch_size, 1).
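If you still want to plug it into Keras, a minimal sketch (my own addition; the wrapper name cindex_score_keras is made up for illustration) is to flatten the (batch_size, 1) tensors to 1D before applying the function above:

def cindex_score_keras(y_true, y_pred):
    # flatten (batch_size, 1) outputs to 1D vectors so the pairwise logic above applies
    y_true = tf.reshape(y_true, [-1])
    y_pred = tf.reshape(y_pred, [-1])
    return cindex_score(y_true, y_pred)

# used exactly like the metric in the question:
# model.compile(loss='mean_squared_error', optimizer='adam', metrics=[cindex_score_keras])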

Pedia
  • Thanks for the answer. I'm not comfortable with tensorflow at all, so I'm trying to understand your translation. Though, to get the code above to work, I included "import tensorflow as tf" and got two consecutive errors: first that the variable "a" is not defined, and then "return tf.where(tf.equal(g, 0), 0.0, g/f) TypeError: where() takes from 1 to 2 positional arguments but 3 were given" – patti_jane Apr 24 '17 at 20:39
  • Which version of tensorflow do you have? – Pedia Apr 25 '17 at 12:45
  • Also, at which line did you get the undefined variable? – Pedia Apr 25 '17 at 13:03
  • There was a typo indeed and I updated it. Still you would need to import tensorflow and numpy. You will also need to check your version of tensorflow. – Pedia Apr 25 '17 at 14:27
  • my version is 0.11.0, I had to replace subtract with sub, and multiply with mul. Undefined variable error is like this, " f = tf.sub(tf.expand_dims(y_true, -1), a) > 0.0 NameError: name 'a' is not defined" – patti_jane Apr 25 '17 at 19:17
  • My bad: I updated the code (a should be y_true). Anyway, the version that you have is old ... can you update it to the current version 1.0.1 ? – Pedia Apr 25 '17 at 20:10
  • If you can't update you tensorflow libs, then I guess the old alternative for 'where' was 'select' but you may need to double-check. – Pedia Apr 25 '17 at 20:17
  • Thanks a lot!! Replacing where with select, it works now! However, could you elaborate a little on how we got rid of the for loop? It would be great if you could point me to some places I can look up to better understand this for future cases. – patti_jane Apr 25 '17 at 21:56
  • Which for-loop exactly? Do you mean vectorizing the code? Well, I simply used matrix functions instead of explicit loops. For instance, if you need to get all elements in a vector V that are larger than all previous elements, then you need the lower (or upper) triangle of the matrix of pair-wise differences (see the sketch after this comment thread). – Pedia Apr 25 '17 at 22:22
  • Just out of curiosity, what made you interested in this metric ? and in which literature did you find it ? – Pedia Apr 25 '17 at 23:30
  • Thanks a lot! I'm trying to evaluate the performance of a regression problem and saw that concordance index is used for such problems – patti_jane Apr 26 '17 at 13:46
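To make the vectorization idea from the last few comments concrete, here is a small NumPy sketch (my own illustration, not part of the original thread): the strict lower triangle of the pairwise-difference matrix enumerates exactly the (i, j) pairs with j < i that the nested Python loop visits.

import numpy as np

V = np.array([0.2, 0.5, 0.1, 0.9])

# diff[i, j] = V[i] - V[j]; the strict lower triangle (i > j) covers every pair with j < i
diff = V[:, None] - V
lower = np.tril(diff, k=-1)

# number of pairs with V[i] > V[j] and i > j -- the quantity the loop version accumulates in `pair`
print(np.sum(lower > 0))  # 4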

I used @Pedia's code with 3D tensors to compute a rank loss for multi-label classification:

def rloss(y_true, y_pred):
    g = tf.subtract(tf.expand_dims(y_pred[1], -1), y_pred[1])
    g = tf.cast(tf.equal(g, 0.0), tf.float32) * 0.5 + tf.cast(tf.greater(g, 0.0), tf.float32)
    f = tf.cast(tf.greater(tf.subtract(tf.expand_dims(y_true[1], -1), y_true[1]), 0.0), tf.float32)
    f = tf.matrix_band_part(f, -1, 0)
    g = tf.reduce_sum(tf.multiply(g, f))
    f = tf.reduce_sum(f)
    return tf.where(tf.equal(g, 0), 0.0, g / f)


model = Sequential()
model.add(Dense(label_length, activation='relu'))
model.add(Dense(label_length, activation='relu'))
model.add(Dense(label_length, activation='sigmoid'))
model.summary()


adagrad = optimizers.Adagrad(lr=0.01, epsilon=1e-08, decay=0.0)
model.compile(loss='binary_crossentropy',
              optimizer=adagrad, metrics=[rloss])
model.fit(X_train, y_train,
          batch_size=batch_size,
          epochs=n_epoch,
          validation_data=(X_test, y_test),
          shuffle=True)

Replace len(y_true) with y_true.shape[0]

If you are on an older version of TensorFlow, use y_true.get_shape()
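For context, a minimal sketch (my own addition) of why the batch dimension keeps coming back as None in the comments below: the static shape is unknown when the metric graph is built, and the dynamic shape is itself a tensor, so neither can drive a Python for loop, which is why the vectorized answer above avoids explicit iteration.

import tensorflow as tf

y_true = tf.placeholder(tf.float32, shape=(None, 1))

static_dim = y_true.get_shape()[0].value   # None: the batch size is unknown at graph-construction time
dynamic_dim = tf.shape(y_true)[0]          # a scalar int32 tensor, only known when data is fed

# range(static_dim) fails because static_dim is None;
# range(dynamic_dim) fails because dynamic_dim is a Tensor, not a Python int.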

pyCthon
  • I'm having the following error, ylen=y_true.shape[0] AttributeError: 'Tensor' object has no attribute 'shape'. – patti_jane Apr 23 '17 at 21:59
  • See this answer, you may be on an older version of TF. http://stackoverflow.com/questions/38666040/tensorflow-attributeerror-tensor-object-has-no-attribute-shape/ "in versions prior to TensorFlow 1.0 tf.Tensor doesn't have a .shape property. You should use the Tensor.get_shape()" – pyCthon Apr 23 '17 at 22:03
  • Sorry I responded too fast, I corrected as get_shape() but I'm having this error now, " for i in range(1, ylen): TypeError: __index__ returned non-int (type NoneType)" – patti_jane Apr 23 '17 at 22:04
  • you might just need `ylen = y_true.get_shape()[0].value` – pyCthon Apr 23 '17 at 22:09
  • Still the same error: TypeError: 'NoneType' object cannot be interpreted as an integer – patti_jane Apr 23 '17 at 22:12
  • Try using keras function: K.int_shape(y_true) – Pedia Apr 24 '17 at 05:21
  • @Pedia with K.int_shape(y_true)[0], it goes back to the NoneType error. – patti_jane Apr 24 '17 at 06:53