I'm using Anaconda to run the code below, with the goal of computing the correlation coefficient between two matrices. The first matrix is built from 16 files, each holding an upper-left triangular matrix. The column sums are divided by the number of rows to get an average, which I want to compare with the contents of another file.

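```python
import numpy as np
import pandas as pd

for i in range(0, 16):
    i = i + 5  # file index offset; `path` is defined elsewhere (not shown)
    file = pd.read_csv(path, header=None)
    file = file.fillna(0)          # empty cells of the triangle become 0
    matrix = np.matrix(file)
    matrix = np.flip(matrix, 1)    # mirror the columns
    b = np.copy(matrix)
    b = np.swapaxes(b, 1, 0)       # transpose the copy
    np.fill_diagonal(b, 0)         # so the diagonal is not counted twice
    c = matrix + b                 # triangle plus its transpose
    sum = c.sum(0) / c.shape[0]    # column means, shape (1, 48)
    sum = pd.DataFrame(sum)
    file2 = pd.read_csv(path, header=None)
    file2 = file2.drop(file2.columns[48], axis=1)  # drop the 49th column
```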

How can I compute the correlation coefficient between the two, given that sum is a matrix of shape (1, 48) and file2 is a matrix of shape (16, 48)?

StepN

1 Answer

I did a bit of research; hopefully the following helps:

  1. numpy.corrcoef
numpy.corrcoef(x, y=None, rowvar=True, bias=<no value>, ddof=<no value>)

Return Pearson product-moment correlation coefficients.

  2. Computing the correlation coefficient between two multi-dimensional arrays

Correlation (default 'valid' case) between two 2D arrays:

You can simply use matrix multiplication with np.dot, like so:

out = np.dot(arr_one,arr_two.T)

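The correlation (default "valid" mode) between each pairwise combination of rows (row1, row2) from the two input arrays corresponds to the entry at position (row1, row2) of the product. Note that this is the unnormalized cross-correlation, as computed by np.correlate, not the Pearson coefficient returned by numpy.corrcoef.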
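
To illustrate the np.dot approach, here is a minimal sketch on made-up data. The array values and the name `check` are only placeholders for this example; the point is that each entry of the product matches what np.correlate returns in "valid" mode for the corresponding pair of equal-length rows.

```python
import numpy as np

# Hypothetical toy inputs: 2 rows and 3 rows, each of length 4.
arr_one = np.array([[1.0, 2.0, 3.0, 4.0],
                    [0.0, 1.0, 0.0, 1.0]])
arr_two = np.array([[1.0, 0.0, 1.0, 0.0],
                    [2.0, 2.0, 2.0, 2.0],
                    [4.0, 3.0, 2.0, 1.0]])

# One matrix product gives every (row1, row2) combination at once.
out = np.dot(arr_one, arr_two.T)  # shape (2, 3)

# The same values, computed pair by pair with np.correlate.
# For equal-length 1-D inputs, "valid" mode returns a single value.
check = np.array([[np.correlate(r1, r2, mode="valid")[0] for r2 in arr_two]
                  for r1 in arr_one])

print(np.allclose(out, check))
```

Both arrays need the same number of columns for this to work, but the number of rows can differ.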

Please clarify your question in case I misunderstood.

henrywongkk
  • Thanks! But I already tried the first point in this case, as well as corrwith. Some things to note: the NumPy function correlate requires the input arrays to be one-dimensional. The NumPy function corrcoef accepts two-dimensional arrays, but they must have the same shape. – StepN Sep 20 '19 at 04:28
  • Correlation between two sets of data requires them to be the same size by definition. Can you tell us your ultimate goal? You may want to give a simplified example, with sum = matrix of (1, 4) and file2 = matrix of (3, 4), to illustrate. – henrywongkk Sep 20 '19 at 04:57
  • Compare the data in the row of sum with the data per column of the other file, and then compute the correlation. – StepN Sep 20 '19 at 16:58
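
For what it's worth, here is a minimal sketch using the simplified shapes suggested in the comments, (1, 4) and (3, 4). The values and the names `sum_row`, `full` and `r` are made up for illustration, and it assumes the goal is the Pearson correlation between the single averaged row and each row of the other file. With the default rowvar=True, np.corrcoef treats each row as a variable and stacks x and y, so the two inputs only need the same number of columns.

```python
import numpy as np

# Hypothetical stand-ins for the real data.
sum_row = np.array([[1.0, 2.0, 3.0, 4.0]])   # shape (1, 4)
file2 = np.array([[2.0, 4.0, 6.0, 8.0],      # perfectly correlated with sum_row
                  [4.0, 3.0, 2.0, 1.0],      # perfectly anti-correlated
                  [1.0, 3.0, 2.0, 5.0]])     # shape (3, 4)

# Rows are variables: the result is a (1 + 3) x (1 + 3) correlation matrix.
full = np.corrcoef(sum_row, file2)

# First row, skipping the self-correlation at [0, 0]:
# Pearson r between sum_row and each row of file2.
r = full[0, 1:]
print(r)
```

With the data from the question, the same call on the (1, 48) and (16, 48) arrays should presumably give 16 coefficients, one per row of file2.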