Tensorflow equivalent of np.corrcoef on a specific axis

Question

I am trying to correlate two matrices column-wise, i.e. correlate the 1st column of the 1st matrix with the 1st column of the 2nd matrix, and so on. In numpy I do:

np.corrcoef(x, y, axis=0)

And it works great. What would be the Tensorflow equivalent of that command?

I tried using streaming_pearson_correlation, but that correlates all the columns together instead of providing a result per column.

As a last resort I'm considering splitting the tensor into separate column tensors, but I'm guessing this will have a performance cost.

I know that I can wrap numpy in a py_func, but then it won't run on a GPU.

Thanks in advance for the help.

I couldn't run your example, there's no axis parameter -- https://docs.scipy.org/doc/numpy/reference/generated/numpy.corrcoef.html — Yaroslav Bulatov, May 07 '17 at 18:22

score 5 · Answer 1 · answered May 07 '17 at 18:48

The documentation page for numpy's corrcoef gives the connection between corrcoef and the covariance matrix. So the natural thing is to rewrite it in terms of matmuls in numpy first:

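import numpy as np

fsize=1   # number of variables (rows) in each of x and y
dsize=3   # number of observations per variable (columns)
x=np.random.random((fsize,dsize))
y=np.random.random((fsize,dsize))
xy=np.concatenate([x,y], axis=0)
# corrcoef(x, y) is the same as corrcoef of the row-stacked matrix
assert np.allclose(np.corrcoef(xy), np.corrcoef(x,y))
mean = np.mean(xy, axis=1, keepdims=True)
cov = ((xy-mean) @ (xy-mean).T)/(dsize-1)
# diagonal matrix holding 1/std of every variable
cov2 = np.diag(1/np.sqrt(np.diag(cov)))
np.testing.assert_allclose(cov2@cov@cov2, np.corrcoef(x, y))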
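In matrix form, the `cov2@cov@cov2` product is the usual rescaling of a covariance matrix into a correlation matrix. The symbols C (for `cov`), D (its diagonal part) and R (the result) are introduced here only for illustration:

```latex
R = D^{-1/2}\, C\, D^{-1/2},
\qquad D = \operatorname{diag}(C_{11}, \dots, C_{mm}),
\qquad R_{ij} = \frac{C_{ij}}{\sqrt{C_{ii}\, C_{jj}}}
```

Here `cov2` plays the role of D^{-1/2}, so multiplying by it on the left and on the right divides row i and column j of the covariance by the standard deviations of variables i and j.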

Now convert to TensorFlow and check that the result is the same:

import tensorflow as tf  # TensorFlow 1.x API (sessions, tf.diag)

def t(x): return tf.transpose(x)
sess = tf.InteractiveSession()

x_t = tf.constant(x)
y_t = tf.constant(y)
xy_t = tf.concat([x_t, y_t], axis=0)
# newer 1.x releases spell this argument keepdims
mean_t = tf.reduce_mean(xy_t, axis=1, keep_dims=True)
cov_t = ((xy_t-mean_t) @ t(xy_t-mean_t))/(dsize-1)
cov2_t = tf.diag(1/tf.sqrt(tf.diag_part(cov_t)))
cor = cov2_t @ cov_t @ cov2_t

np.testing.assert_allclose(np.corrcoef(x, y), cor.eval())

The correlations between the variables that make up x and those that make up y are in the off-diagonal blocks of this matrix.
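For the column-wise case asked about in the question, only the diagonal of that off-diagonal block is needed, so the full matrix can be skipped. The following is a minimal NumPy sketch under that assumption. The helper name `columnwise_corr` and the array sizes are made up for illustration, and observations are assumed to run along axis 0. Each step is a plain mean, sum or square root, so it should translate to `tf.reduce_mean`, `tf.reduce_sum` and `tf.sqrt` in the same way as above.

```python
import numpy as np

def columnwise_corr(x, y):
    """Pearson correlation between column j of x and column j of y, for every j."""
    # Center each column (observations run along axis 0).
    xc = x - x.mean(axis=0, keepdims=True)
    yc = y - y.mean(axis=0, keepdims=True)
    # Covariance numerator and the product of the two norms; the 1/(n-1)
    # factor cancels between numerator and denominator, so it is omitted.
    num = (xc * yc).sum(axis=0)
    den = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum(axis=0))
    return num / den

rng = np.random.default_rng(0)
x = rng.random((50, 4))   # 50 observations, 4 columns (illustrative sizes)
y = rng.random((50, 4))

direct = columnwise_corr(x, y)

# Cross-check against the full correlation matrix: np.corrcoef treats rows
# as variables, so pass the transposes and read the diagonal of the
# upper-right (x versus y) block.
n = x.shape[1]
full = np.corrcoef(x.T, y.T)
from_block = np.diag(full[:n, n:])
np.testing.assert_allclose(direct, from_block)
```

The direct version needs memory proportional to the input, whereas the full matrix grows with the square of the number of columns.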

Yaroslav Bulatov

Can you please tell me what the `@` means in your solution? Thank you – I. A Aug 14 '17 at 15:04
a@b maps to tf.matmul(a,b) – Yaroslav Bulatov Aug 14 '17 at 15:34
Thank you :). Could you please point me to a handy tutorial on this notation? – I. A Aug 14 '17 at 15:47
it's a Python 3.5 feature -- https://stackoverflow.com/questions/27385633/what-is-the-symbol-for-in-python – Yaroslav Bulatov Aug 14 '17 at 15:56
Could you please tell me more about why we have the following 2 lines: `cov2_t = tf.diag(1/tf.sqrt(tf.diag_part(cov_t)))` and `cor = cov2_t @ cov_t @ cov2_t`? And is `dsize` the number of features? Thank you so much – I. A Aug 14 '17 at 16:13