8

I've been trying to find a way to display correlation coefficients in the lower or upper triangle of a pandas scatter matrix. Can someone point me in the right direction? Thank you.

Chuck
  • 1
    Use the pandas `DataFrame.corr()` method? http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.corr.html – alacy Jan 04 '15 at 23:15
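As a minimal sketch of what that comment points to: `DataFrame.corr()` returns the pairwise correlation matrix (Pearson by default) as a square DataFrame, which is the source of the numbers to annotate. The random data and the column names below are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical demo data: 100 rows of standard-normal noise in four columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 4)), columns=["a", "b", "c", "d"])

# Pairwise Pearson correlations; the result is a symmetric 4x4 DataFrame
# with ones on the diagonal.
corr = df.corr()
print(corr.round(3))
```

The entry `corr.loc["a", "b"]` is the coefficient that belongs in the panel plotting `a` against `b`.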

1 Answer

25

A working minimal example:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Random demo data: 100 rows, four columns.
df = pd.DataFrame(np.random.randn(100, 4), columns=['a', 'b', 'c', 'd'])
# scatter_matrix returns a 2D array of Axes, one per pair of columns.
axes = scatter_matrix(df, alpha=0.5, diagonal='kde')
corr = df.corr().to_numpy()
# Annotate each panel strictly above the diagonal (k=1) with its correlation.
for i, j in zip(*np.triu_indices_from(axes, k=1)):
    axes[i, j].annotate("%.3f" % corr[i, j], (0.8, 0.8), xycoords='axes fraction', ha='center', va='center')
plt.show()

https://i.stack.imgur.com/apXPu.png

Ohjeah
  • 3
    You can also specify the figure size: `scatter_matrix(df, alpha=0.5, figsize=(15, 10), diagonal='kde')` – Sean Nguyen May 26 '17 at 12:40
  • Anyone looking to annotate the lower triangle instead can replace `triu` with `tril` (and use `k=-1` to stay below the diagonal): https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.tril.html#numpy.tril – scatter Aug 08 '19 at 09:28
  • 2
    `as_matrix()` is long deprecated. Use `to_numpy()` instead. – Mr. T Mar 05 '22 at 13:45
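
The lower-triangle variant and the `figsize` option suggested in the comments can be combined roughly as follows. This is only a sketch: the helper name `annotate_lower_triangle`, the random data, and the use of `diagonal="hist"` (chosen so the example does not need SciPy for the KDE) are assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix


def annotate_lower_triangle(df, figsize=(8, 8)):
    """Draw a scatter matrix and label each panel below the diagonal
    with the correlation of its two columns (hypothetical helper)."""
    axes = scatter_matrix(df, alpha=0.5, figsize=figsize, diagonal="hist")
    corr = df.corr().to_numpy()
    # k=-1 selects index pairs strictly below the diagonal.
    for i, j in zip(*np.tril_indices_from(axes, k=-1)):
        axes[i, j].annotate(f"{corr[i, j]:.3f}", (0.8, 0.8),
                            xycoords="axes fraction", ha="center", va="center")
    return axes


# Hypothetical demo data, as in the answer above.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 4)), columns=["a", "b", "c", "d"])
axes = annotate_lower_triangle(df)
```

Calling `matplotlib.pyplot.show()` afterwards displays the figure, exactly as in the answer's example.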