
I am trying to calculate the correlation coefficient between every possible pair of rows in a DataFrame. How can I do this?

My code:

import pandas as pd
d = {'Name': ['A', 'B','C'], 'v1': [1,3, 4], 'v2': [3,2, 4], 'v3': [3,9 ,1]}
df = pd.DataFrame(data=d)
result = df.T.corr().unstack().reset_index(name="corr")

But it raises the error IndexError: list index out of range.

Thank you for your assistance

Platalea Minor

1 Answer

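  1. First, set Name as the index so that the transposed frame contains only the numeric values.
  2. After calling corr(), give the column axis its own name (NameX here) so the two axes do not share the name Name.
  3. Finally, after reset_index(), rename the remaining columns to NameY and corr.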
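import pandas as pd

d = {'Name': ['A', 'B','C'], 'v1': [1,3, 4], 'v2': [3,2, 4], 'v3': [3,9 ,1]}
# Step 1: move Name into the index so only numeric columns are transposed
df = pd.DataFrame(data=d).set_index("Name")
result = df.T.corr()
# Step 2: name the column axis so it differs from the index name
result.columns.set_names("NameX", inplace=True)
# Step 3: flatten to long form and rename the remaining columns
result = result.unstack().to_frame().reset_index().rename(columns={"Name":"NameY",0:"corr"})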

Output:

NameX NameY      corr
    A     A  1.000000
    A     B  0.381246
    A     C -0.500000
    B     A  0.381246
    B     B  1.000000
    B     C -0.991241
    C     A -0.500000
    C     B -0.991241
    C     C  1.000000
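
The table above lists every ordered pair, so each correlation appears twice and the diagonal is always 1. If only the distinct pairs are of interest, one possible variation (a sketch, not part of the original code) is to build the long table from itertools.combinations. The variable names corr and pairs are just illustrative choices.

```python
from itertools import combinations

import pandas as pd

d = {'Name': ['A', 'B', 'C'], 'v1': [1, 3, 4], 'v2': [3, 2, 4], 'v3': [3, 9, 1]}
df = pd.DataFrame(data=d).set_index("Name")

# Row-by-row correlation matrix: the rows become columns after the transpose
corr = df.T.corr()

# Keep each unordered pair of distinct names exactly once
pairs = pd.DataFrame(
    [(x, y, corr.loc[x, y]) for x, y in combinations(corr.index, 2)],
    columns=["NameX", "NameY", "corr"],
)
print(pairs)
```

For the three rows here this should leave only the pairs A/B, A/C and B/C, with the same correlation values as in the full table.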
Rob Raymond