Computing row-wise correlation coefficients between two 2d arrays in Python

Question

I have two numpy arrays of identical size M X T (let's call them A and B). I'd like to compute the Pearson correlation coefficient across T between each pair of the same row m in A and B (so, A[i,:] and B[i,:], then A[j,:] and B[j,:]; but never A[i,:] and B[j,:], for example).

I'm expecting my output to be either a one-dimensional array with shape (M,) or a two-dimensional array with shape (M,1).

The arrays are quite large (on the order of 1-2 million rows), so I'm looking for a vectorized solution that will let me avoid a for-loop. Apologies if this has already been answered, but many of the code snippets in previous answers seem designed to give the full M X M correlation matrix, i.e., correlation coefficients between all possible pairs of rows rather than just index-matched rows. What I am looking for is just the diagonal of this matrix. It feels wasteful to calculate the whole thing if all I need is the diagonal, and in fact it throws memory errors when I try to do that anyway.

What's the fastest way to implement this? Thanks very much in advance.

score 1 · Accepted Answer · answered Dec 03 '18 at 21:54

I think I'd just use a list comprehension and scipy's pearsonr to calculate the coefficient for each pair of rows:

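from scipy.stats import pearsonr  # scipy.stats.stats is a deprecated import path
import numpy as np

M = 10  # number of rows
T = 4   # number of samples per row
A = np.random.rand(M, T)
B = np.random.rand(M, T)
# One coefficient per index-matched pair of rows: A[i, :] with B[i, :]
diag_pear_coef = [pearsonr(A[i, :], B[i, :])[0] for i in range(M)]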

Does that work for you? Note that pearsonr returns both the correlation coefficient and a p-value, hence the [0] indexing.
Good luck!
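
If the list comprehension turns out to be too slow for 1-2 million rows, the same quantity can probably be computed with plain NumPy and no Python-level loop. What follows is a minimal sketch rather than a built-in function: rowwise_pearson is a hypothetical helper name, and it assumes A and B have the same shape (M, T) and that no row is constant (a constant row has zero variance, so its coefficient comes out as nan).

```python
import numpy as np

def rowwise_pearson(A, B):
    """Pearson r between A[i, :] and B[i, :] for every row i.

    Hypothetical helper; returns a 1-D array of shape (M,).
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Center each row by subtracting its own mean across T.
    A_c = A - A.mean(axis=1, keepdims=True)
    B_c = B - B.mean(axis=1, keepdims=True)
    # Row-wise sums of products, without forming any M x M matrix.
    cov = np.einsum("ij,ij->i", A_c, B_c)
    var_a = np.einsum("ij,ij->i", A_c, A_c)
    var_b = np.einsum("ij,ij->i", B_c, B_c)
    # A constant row makes the denominator zero and the result nan.
    return cov / np.sqrt(var_a * var_b)

rng = np.random.default_rng(0)
M, T = 10, 4
A = rng.random((M, T))
B = rng.random((M, T))
r = rowwise_pearson(A, B)
print(r.shape)  # (10,)
```

Memory use stays at a few arrays of size M X T, which is the point of avoiding the full correlation matrix. Use r[:, None] if the (M, 1) shape is preferred.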

ShlomiF


Yes, thank you! I figured it was something simple like this but I'm new to Python and still wrapping my head around list comprehension. Thanks again. – Emily Finn Dec 04 '18 at 02:12
Interesting that there is no implemented function for this in numpy or scipy? – Johannes Wiesner Aug 26 '21 at 13:15
