Covariance Matrix
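A covariance matrix is an n×n symmetric matrix, where n is the number of variables in the data you are starting with (the columns of the starting matrix when each column holds one variable; note that np.cov treats each row as a variable by default). It shows how the vector variables covary, meaning how they tend to move with respect to one another.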
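As a rough illustration of the definition, the sketch below builds the matrix by hand and compares it with np.cov. It assumes the same array used in the Visualization section, with each row as one variable, and the sample normalisation (dividing by the number of observations minus one) that np.cov applies by default. The names data, centered and cov_manual are only illustrative.

```python
import numpy as np

# Assumption: rows are variables, columns are observations.
data = np.array([[10, 39, 19, 23, 28],
                 [43, 13, 32, 21, 20],
                 [15, 16, 22, 85, 15]], dtype=float)

n_obs = data.shape[1]

# Subtract each variable's mean, then average the pairwise products.
centered = data - data.mean(axis=1, keepdims=True)
cov_manual = centered @ centered.T / (n_obs - 1)

print(cov_manual.shape)                           # (3, 3): one row/column per variable
print(np.allclose(cov_manual, cov_manual.T))      # True: the matrix is symmetric
print(np.allclose(cov_manual, np.cov(data)))      # True: matches NumPy's result
```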
Components
On the main diagonal you find the variance of each vector, since var(X) = cov(X, X). Every other entry holds the covariance between a pair of different vectors.
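A quick way to check this, sketched here with NumPy: the diagonal of np.cov should equal the per-row sample variances. The ddof=1 argument is needed because np.var divides by the number of observations by default, while np.cov divides by that number minus one.

```python
import numpy as np

# Rows are the vectors (variables), columns are the observations.
x = np.array([[10, 39, 19, 23, 28],
              [43, 13, 32, 21, 20],
              [15, 16, 22, 85, 15]], dtype=float)

cov = np.cov(x)

# Sample variance of each row, using the same normalisation as np.cov.
variances = x.var(axis=1, ddof=1)

print(np.diag(cov))
print(np.allclose(np.diag(cov), variances))  # True
```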
Positive and negative coefficients
On the main diagonal no value can be negative, since each one represents the variance of a vector. At any other position, the covariance can be written as the product of two standard deviations, s(X) and s(Y), which are always non-negative, and the Pearson correlation coefficient p, which ranges over [-1, 1]: this is the coefficient that makes the value positive or negative.
cov(X, Y) = p(X,Y)s(X)s(Y)
There are three possibilities:
- p(X, Y) = 0: no linear correlation between the vectors.
- p(X, Y) > 0: positive correlation, meaning that when the values of X grow, the values of Y tend to grow as well.
- p(X, Y) < 0: negative correlation, meaning that when the values of X grow, the values of Y tend to decrease.
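The identity and the role of the sign can be checked numerically. The sketch below assumes two illustrative vectors X and Y (the first two rows of the array from the Visualization section) and uses sample standard deviations (ddof=1) so that they match the normalisation of np.cov.

```python
import numpy as np

X = np.array([10, 39, 19, 23, 28], dtype=float)
Y = np.array([43, 13, 32, 21, 20], dtype=float)

cov_xy = np.cov(X, Y)[0, 1]      # off-diagonal entry of the 2x2 covariance matrix
p_xy = np.corrcoef(X, Y)[0, 1]   # Pearson correlation coefficient
s_x = X.std(ddof=1)              # sample standard deviations
s_y = Y.std(ddof=1)

# cov(X, Y) = p(X, Y) * s(X) * s(Y)
print(np.isclose(cov_xy, p_xy * s_x * s_y))  # True

# The standard deviations are non-negative, so the sign comes from p alone.
print(np.sign(cov_xy) == np.sign(p_xy))      # True
```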
The effect of the standard deviations on the coefficients in the matrix is "just" magnitude: the entries get larger in absolute value when the standard deviations of the data points are higher, even if the underlying correlation stays the same.
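To make this concrete, here is a small sketch that rescales one vector by an arbitrary factor (scale = 10.0, chosen only for illustration). The covariance should grow by that factor while the correlation stays put.

```python
import numpy as np

X = np.array([10, 39, 19, 23, 28], dtype=float)
Y = np.array([43, 13, 32, 21, 20], dtype=float)

scale = 10.0  # arbitrary positive factor, an assumption for this example

cov_before = np.cov(X, Y)[0, 1]
cov_after = np.cov(scale * X, Y)[0, 1]

corr_before = np.corrcoef(X, Y)[0, 1]
corr_after = np.corrcoef(scale * X, Y)[0, 1]

# Scaling X multiplies its standard deviation, and so the covariance, by `scale`.
print(np.isclose(cov_after, scale * cov_before))  # True

# The correlation is unaffected by the change of scale.
print(np.isclose(corr_after, corr_before))        # True
```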
Visualization
To better visualise the content of the matrix I am using the heatmap function from the seaborn Python package. I have also added the correlation matrix to better compare the results.
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns

# Each row is one vector (variable); each column is one observation.
x = np.array([[10, 39, 19, 23, 28],
              [43, 13, 32, 21, 20],
              [15, 16, 22, 85, 15]])

# Number the vectors from 1 so the tick labels match the interpretation.
labels = list(range(1, len(x) + 1))

plt.rcParams['figure.figsize'] = [10, 5]

# Left panel: covariance matrix (np.cov treats rows as variables by default).
plt.subplot(1, 2, 1)
sns.heatmap(np.cov(x),
            annot=True,
            cbar=False,
            fmt="0.2f",
            cmap="YlGnBu",
            xticklabels=labels,
            yticklabels=labels)
plt.title("Covariance matrix")

# Right panel: correlation matrix of the same data.
plt.subplot(1, 2, 2)
sns.heatmap(np.corrcoef(x),
            annot=True,
            cbar=False,
            fmt="0.2f",
            cmap="YlGnBu",
            xticklabels=labels,
            yticklabels=labels)
plt.title("Correlation matrix")

plt.show()
Output:

Interpretation
The third vector has an exceptionally high variance compared with the others (about 933, against about 116 and 139). All pairs of vectors are negatively correlated: vectors 1 and 2 strongly so (a correlation of about -0.95), while vectors 1 and 3 are the least correlated (about -0.06).
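These readings can also be recovered programmatically rather than by eye. The following is a minimal sketch, assuming the same array as above and numbering the vectors from 1; the names pairs, strongest and weakest are illustrative only.

```python
import numpy as np

x = np.array([[10, 39, 19, 23, 28],
              [43, 13, 32, 21, 20],
              [15, 16, 22, 85, 15]], dtype=float)

corr = np.corrcoef(x)

# Collect each distinct pair once, from the upper triangle above the diagonal.
rows, cols = np.triu_indices_from(corr, k=1)
pairs = {(int(i) + 1, int(j) + 1): float(corr[i, j]) for i, j in zip(rows, cols)}

# Rank the pairs by the absolute value of their correlation.
strongest = max(pairs, key=lambda pair: abs(pairs[pair]))
weakest = min(pairs, key=lambda pair: abs(pairs[pair]))

for pair, value in pairs.items():
    print(pair, round(value, 2))
print("strongest:", strongest)  # (1, 2)
print("weakest:", weakest)      # (1, 3)
```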