I have two numpy arrays of identical size M X T (let's call them A and B). I'd like to compute the Pearson correlation coefficient across T between each pair of the same row m in A and B (so, A[i,:] and B[i,:], then A[j,:] and B[j,:]; but never A[i,:] and B[j,:], for example).
I'm expecting my output to be either a one-dimensional array with shape (M,) or a two-dimensional array with shape (M,1).
The arrays are quite large (on the order of 1-2 million rows), so I'm looking for a vectorized solution that will let me avoid a for-loop. Apologies if this has already been answered, but many of the code snippets in previous answers (e.g., this one) seem designed to give the full M X M correlation matrix, i.e., correlation coefficients between all possible pairs of rows rather than just index-matched rows. What I am looking for is basically just the diagonal of this matrix, but it feels wasteful to calculate the whole thing if all I need is the diagonal, and in fact doing so throws memory errors when I try it anyway.
What's the fastest way to implement this? Thanks very much in advance.
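For concreteness, here is a minimal sketch of the row-matched computation described above, not necessarily the fastest option. It centers each row, then takes row-wise dot products with np.einsum, so the M X M matrix is never formed. The function name rowwise_pearson is made up for illustration, and the sketch assumes both arrays fit in memory as floats and that no row is constant (a constant row has zero variance and would yield nan).

```python
import numpy as np

def rowwise_pearson(A, B):
    """Pearson r between A[m, :] and B[m, :] for every row m (hypothetical helper)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Center each row by subtracting its own mean across T.
    A_c = A - A.mean(axis=1, keepdims=True)
    B_c = B - B.mean(axis=1, keepdims=True)
    # Row-wise dot products: only M values are produced, never an M x M matrix.
    cov = np.einsum("ij,ij->i", A_c, B_c)
    var_a = np.einsum("ij,ij->i", A_c, A_c)
    var_b = np.einsum("ij,ij->i", B_c, B_c)
    # Assumption: no row is constant; otherwise this divides by zero and gives nan.
    return cov / np.sqrt(var_a * var_b)

# Small usage example with random data.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 20))
B = rng.standard_normal((5, 20))
r = rowwise_pearson(A, B)  # shape (5,); use r[:, None] for shape (5, 1)
```

The centered copies A_c and B_c each take as much memory as the inputs, so for millions of rows it may be worth processing the rows in chunks.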