
Is there a way to compute a correlation coefficient, or some equivalent measure, for more than two variables (columns of data) at once? I have a large number of columns of data in Python, and I've computed the correlation coefficient for every possible pair of columns using np.corrcoef(), but I would like to produce a coefficient that relates three or more columns at once.

What I have in mind is plotting all the data points from three columns in 3D space, then finding the vector that best fits the data and measuring how strongly linear the relationship is.
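A rough sketch of that idea, in case it helps make the question concrete. It assumes the "best-fit vector" means the first principal direction of the centered data (found here with an SVD) and that "how strongly linear" can be measured by the share of total variance lying along that direction. The function name `best_fit_line` and the `standardize` flag are made up for illustration, and the data is synthetic.

```python
import numpy as np

def best_fit_line(data, standardize=False):
    """Fit a line through an (n_samples, n_columns) array.

    Returns the centroid, a unit direction vector, and the fraction of
    total variance explained by that direction (1.0 = perfectly linear).
    """
    X = np.asarray(data, dtype=float)
    centroid = X.mean(axis=0)
    centered = X - centroid
    if standardize:
        # Assumption: no column is constant, so every std is non-zero.
        centered = centered / X.std(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return centroid, vt[0], explained[0]

# Synthetic example: points scattered tightly around the direction (1, 2, -1).
rng = np.random.default_rng(0)
t = rng.normal(size=500)
noise = 0.05 * rng.normal(size=(500, 3))
data = np.outer(t, [1.0, 2.0, -1.0]) + noise

centroid, direction, linearity = best_fit_line(data)
print(direction, linearity)
```

The sign of `direction` is arbitrary, since a line has no preferred orientation. If the columns have different units, `standardize=True` is probably the safer choice, because otherwise the column with the largest scale dominates the fit.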

Kevin
  • Look into PCA (principal component analysis). May not be exactly what you're looking for, but will give you a better understanding of the driving variables in your dataframe. – rahlf23 Jun 26 '18 at 15:14
  • Thanks, I was thinking about looking into PCA earlier. I'm just curious why there aren't any obvious ways to do this particular task. Only 2D correlation seems to be done. – javafrapp90 Jun 26 '18 at 15:17
  • This SO post may be useful as well: https://stackoverflow.com/q/2298390/8146556 – rahlf23 Jun 26 '18 at 15:18

0 Answers