TL;DR
To plot confidence intervals after an LDA analysis, should I use the covariance matrix shared by all classes (lda.covariance_), or should I calculate and use the covariance matrix of each class?
Long question
Some time ago, I asked a question about how to draw ellipses around points: Draw ellipses around points
These ellipses will represent confidence intervals for Linear Discriminant Analysis (LDA) data points.
I will reuse my old picture, which I got from a scientific publication:
The red points (for example) could be defined as follows, after the LDA calculations:
[[-23.88315146 -3.26328266] # first point
[-25.94906669 -1.47440904] # second point
[-26.52423229 -4.84947907]] # third point
You can see on the picture that the red points are surrounded by an ellipse, which represents the confidence interval (at a certain level) for the mean of the red points.
This is what I would like to obtain. Now scikit-learn's doc has an example about that (here):
    def plot_ellipse(splot, mean, cov, color):
        v, w = linalg.eigh(cov)
        u = w[0] / linalg.norm(w[0])
        angle = np.arctan(u[1] / u[0])
        angle = 180 * angle / np.pi  # convert to degrees
        # filled Gaussian at 2 standard deviation
        ell = mpl.patches.Ellipse(mean, 2 * v[0] ** 0.5, 2 * v[1] ** 0.5,
                                  180 + angle, color=color)
And this function is called like this:
plot_ellipse(splot, lda.means_[0], lda.covariance_, 'red')
In the doc's example, plot_ellipse is called to draw the confidence interval of every class, always with the same covariance: lda.covariance_. lda.covariance_ is then used to determine the angle of the ellipses. As lda.covariance_ never changes, all the ellipses have the same angle.
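To make the shared-covariance case concrete, here is a minimal NumPy-only sketch. Everything in it is an assumption for illustration: the two classes are synthetic, ellipse_from_cov is a hypothetical helper (not scikit-learn's plot_ellipse), and pooled_cov is a hand-computed pooled within-class covariance standing in for lda.covariance_, whose exact normalisation in scikit-learn may differ.

```python
import numpy as np


def ellipse_from_cov(mean, cov, n_std=2.0):
    """Hypothetical helper: (center, width, height, angle_deg) of an n_std-sigma ellipse."""
    # eigh returns eigenvalues in ascending order, eigenvectors as columns
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, -1]  # direction of largest variance
    angle_deg = np.degrees(np.arctan2(major[1], major[0])) % 180.0
    # full axis lengths: major axis first, then minor axis
    width, height = 2.0 * n_std * np.sqrt(eigvals[::-1])
    return np.asarray(mean, dtype=float), width, height, angle_deg


# Synthetic, made-up data: two classes with deliberately different orientations
rng = np.random.default_rng(0)
class_a = rng.multivariate_normal([-25.0, -3.0], [[3.0, 1.5], [1.5, 1.0]], size=200)
class_b = rng.multivariate_normal([10.0, 4.0], [[3.0, -1.5], [-1.5, 1.0]], size=200)
classes = [class_a, class_b]

# Assumed stand-in for lda.covariance_: pooled within-class covariance
n_total = sum(len(c) for c in classes)
pooled_cov = sum((len(c) - 1) * np.cov(c, rowvar=False) for c in classes) / (n_total - len(classes))

# One ellipse per class, all built from the same covariance matrix
shared = [ellipse_from_cov(c.mean(axis=0), pooled_cov) for c in classes]
shared_angles = [e[3] for e in shared]
```

Only the centre changes from class to class here; the width, height and angle all come from pooled_cov, so every ellipse is a translated copy of the same shape.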
Is it mathematically correct to do that? I am tempted to say no.
On another post (multidimensional confidence intervals), which is not related to LDA, @Joe Kington simply uses a "2-sigma ellipse of the scatter of points". He calculates the covariance for each class:
cov = np.cov(points, rowvar=False)
where points would be the 3 points described above, for example. He then calculates the angle of the ellipses in a similar way. But because he calculates the covariance matrix for each class, the angles of the ellipses are not the same across the classes.
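For comparison, the per-class approach applied to the three red points above might look like the sketch below. It is only an illustration under assumptions: ellipse_params is a hypothetical helper, n_std=2.0 mirrors the "2-sigma" choice, and the ellipse describes the scatter of the points rather than a formal confidence region for their mean.

```python
import numpy as np


def ellipse_params(points, n_std=2.0):
    """Hypothetical helper: (center, width, height, angle_deg) from one class's points."""
    points = np.asarray(points, dtype=float)
    center = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)  # per-class covariance, shape (2, 2)
    # eigh returns eigenvalues in ascending order, eigenvectors as columns
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.maximum(eigvals, 0.0)  # guard against tiny negative round-off
    major = eigvecs[:, -1]  # direction of largest variance
    angle_deg = np.degrees(np.arctan2(major[1], major[0]))
    # full axis lengths: major axis first, then minor axis
    width, height = 2.0 * n_std * np.sqrt(eigvals[::-1])
    return center, width, height, angle_deg


# The three red points quoted earlier in the question
red_points = np.array([
    [-23.88315146, -3.26328266],
    [-25.94906669, -1.47440904],
    [-26.52423229, -4.84947907],
])

center, width, height, angle_deg = ellipse_params(red_points)
```

The returned values could then be passed to matplotlib.patches.Ellipse(xy=center, width=width, height=height, angle=angle_deg). Because the covariance is recomputed from each class's own points, each class gets its own angle and axis lengths.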