I am very new to Python. I have a pandas DataFrame that looks like a 2D matrix with 26 columns and 9047943 rows. For example:

array([[123,234,345],
       [567,543,342],
       [735,276,697]])

I want to calculate the correlation coefficient and the p-value for each row, i.e. the correlation coefficient and the p-value for [123,234,345], then move on to the next row [567,543,342]. I think the answer should look like the one in "T-test in Pandas".

I have done a lot of research but cannot find the answer. Any suggestions? Thanks a lot for your help!

1 Answer

Maybe something like this. Say your DataFrame is df and all of its columns are int/float:

import numpy as np
df.apply(lambda x: np.corrcoef(x), axis=1)
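
One caveat: np.corrcoef applied to a single 1-D row has nothing to correlate it against, so it returns 1.0 (or nan for a constant row) and never gives a p-value. If what you actually need is each row correlated with some second vector, a rough sketch could look like the one below. It assumes scipy is installed, and both the helper name rowwise_pearson and the reference vector (here just the column positions 0, 1, 2) are made up for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd
from scipy import stats


def rowwise_pearson(df, reference):
    """Hypothetical helper: Pearson r and p-value of every row against one reference vector."""
    ref = np.asarray(reference, dtype=float)
    values = df.to_numpy(dtype=float)
    # pearsonr returns a (statistic, p-value) pair for two equal-length 1-D samples
    results = [stats.pearsonr(row, ref) for row in values]
    return pd.DataFrame(
        {
            "r": [res[0] for res in results],
            "p_value": [res[1] for res in results],
        },
        index=df.index,
    )


# Toy data copied from the question
df = pd.DataFrame([[123, 234, 345],
                   [567, 543, 342],
                   [735, 276, 697]])

# Assumption: the second sample is simply the column position of each value
reference = np.arange(df.shape[1])

result = rowwise_pearson(df, reference)
print(result)
```

The first example row rises linearly, so its r against 0, 1, 2 comes out at essentially 1. A plain Python loop like this will be slow on roughly 9 million rows, so treat it as a starting point rather than a finished solution.
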
YOLO
  • Thank you for your reply. The question has been updated. Please read it. – Timothy Leung Mar 02 '18 at 11:24
  • Thanks. But my question is: do you want to calculate the p-value of 3 numbers? I believe a p-value is calculated when comparing two samples, which you don't have. – YOLO Mar 02 '18 at 14:45