
The Partial Least Squares (PLS) algorithm is implemented in the scikit-learn library, as documented here: http://scikit-learn.org/0.12/auto_examples/plot_pls.html

In the case where y is a binary vector, a variant of this algorithm is used: the Partial Least Squares Discriminant Analysis (PLS-DA) algorithm. Does the PLSRegression module in sklearn.pls also implement this binary case? If not, where can I find a Python implementation for it? In my binary case, I'm trying to use PLSRegression:

from sklearn.pls import PLSRegression

pls = PLSRegression(n_components=10)
pls.fit(x, y)
x_r, y_r = pls.transform(x, y, copy=True)

In the transform function, the code raises an exception on this line:

y_scores = np.dot(Yc, self.y_rotations_)

The error message is "ValueError: matrices are not aligned". Yc is the normalized y vector, and self.y_rotations_ = [1.]. In the fit function, self.y_rotations_ is set to np.ones(1) if the original y is a univariate vector (y.shape[1] == 1).
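For reference, here is a small illustration (an assumption about the mechanism, not a confirmed diagnosis of the sklearn 0.12 internals) of how dotting a flat length-n score vector with the length-1 y_rotations_ array produces exactly this kind of alignment error, while a column-shaped Yc would not:

import numpy as np

y_rotations = np.ones(1)            # shape (1,), as set in fit for univariate y

Yc_col = np.random.rand(20, 1)      # normalized y kept as a column
np.dot(Yc_col, y_rotations)         # works: result has shape (20,)

Yc_flat = Yc_col.ravel()            # shape (20,)
np.dot(Yc_flat, y_rotations)        # ValueError: (20,) and (1,) are not aligned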

Noam Peled
  • Did you ever resolve this? I have tried the same method (using the latest version of scikit-learn) and it seems to do PLS-DA perfectly. The key is to label classes with 1 and 0 (for same/other class). If you still can't get it to work, can you post your data? – mfitzp Oct 04 '13 at 10:54
  • Haven't resolved it yet, but I'll try user3178149's solution. Thanks for offering your help! – Noam Peled Feb 10 '14 at 07:01
  • @mfitzp Is partial least squares regression the same as partial least squares discriminant analysis? I am trying to figure out how to get plots from the first two components. – O.rka Jul 29 '16 at 20:07
  • @O.rka correct, PLS-DA for two groups is just PLS regression against a binary variable (0 or 1) representing group membership. See [here](http://mfitzp.io/article/partial-least-squares-discriminant-analysis-plsda/) for a longer write-up. – mfitzp Jul 29 '16 at 20:29
  • Thanks for that. I've just recently gotten introduced to ordination and I want to understand it before I start implementing it. Wow. AMAZING tutorial – O.rka Jul 29 '16 at 20:35

4 Answers


PLS-DA is really a "trick" to use PLS for categorical outcomes instead of the usual continuous vector/matrix. The trick consists of creating a dummy indicator matrix of zeros/ones which encodes membership in each of the categories. So if you have a binary outcome to predict (i.e., male/female, yes/no, etc.), your dummy matrix will have TWO columns representing membership in either category.

For example, consider the outcome gender for four people: 2 males and 2 females. The dummy matrix should be coded as:

import numpy as np
dummy = np.array([[1, 1, 0, 0], [0, 0, 1, 1]]).T

where each column represents membership in one of the two categories (male, female).

Then your model for data in the variable Xdata (shape: 4 rows, arbitrary columns) would be:

myplsda = PLSRegression().fit(X=Xdata, Y=dummy)

The predicted categories can then be extracted by comparing the two indicator columns in mypred:

mypred = myplsda.predict(Xdata)

For each row/case, the predicted gender is the one with the highest predicted membership value.
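Here is a minimal end-to-end sketch of that last step (the random Xdata, the explicit n_components, and the modern sklearn.cross_decomposition import are assumptions for illustration, not part of the original answer):

import numpy as np
from sklearn.cross_decomposition import PLSRegression

Xdata = np.random.rand(4, 5)                      # 4 samples, arbitrary features
dummy = np.array([[1, 1, 0, 0], [0, 0, 1, 1]]).T  # membership indicators (male, female)

myplsda = PLSRegression(n_components=2).fit(Xdata, dummy)
mypred = myplsda.predict(Xdata)                   # shape (4, 2): one score per category

# the predicted category is the column with the highest score in each row
labels = np.array(['male', 'female'])
predicted = labels[mypred.argmax(axis=1)]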

markcelo
  • If your data contain only two classes, it is better to represent y as a single column and do the regression, then identify the class using a threshold halfway between the two class values: for example, if 1 codes one class and -1 the other, the threshold is 0 (see the sketch after these comments). Problems arise if a matrix is used for y, which is also why PLS-DA is not recommended for multiclass problems. See the paper _Partial least squares discriminant analysis: taking the magic away_ for a detailed discussion. – Elkan Jan 23 '17 at 05:44
  • @Elkan: if you meant [this](file:///C:/Users/adm/Downloads/Brereton_et_al-2014-Journal_of_Chemometrics-1.pdf) paper, its whole argument rests on "An alternative to PLS1 is to use PLS2 [5,6]. It is not the purpose of this paper to expand on the PLS2 algorithm". Do not forget that everything is done with (very fast) computer calculations, and the paper's arguments can be answered with tools from linear vector algebra; the author himself does not back up his dismissive viewpoint with the algorithmic details he criticizes. – JeeyCi Aug 15 '22 at 10:26
  • @JeeyCi You should go through that section in detail, where the author already discusses the problems of using a one-vs-all strategy or making `Y` a matrix when applying PLS-DA to multi-class data. BTW, your link is not working. – Elkan Aug 16 '22 at 03:47
  • [downloadable link](https://www.researchgate.net/profile/Mike-Lane/post/What_is_the_difference_between_principal_components_analysis_and_partial_least_squares_discriminant_analysis/attachment/5ab3e9dc4cde266d58932c8e/AS%3A607036769329152%401521740252269/download/Brereton_et_al-2014-Journal_of_Chemometrics-1.pdf) - I just don't understand why the author hesitates to use geographical weighting (using vectors), though of course the problem of multicollinearity can cause biased estimates in any analysis – JeeyCi Aug 16 '22 at 13:56
  • And besides, from the article: "LDA with prior reduction of dimensionality can perform just as well." Dimensionality reduction with PCA (which ignores the dependency of the target on the features) does not seem a good alternative at all; after such a reduction, only features that do not affect the target may be left (in my view). – JeeyCi Aug 16 '22 at 14:13
  • In general, see [17 variable selection methods in PLS](https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/10.1002/cem.3226), and the chapter [6.7.7. How the PLS model is calculated](https://learnche.org/pid/latent-variable-modelling/projection-to-latent-structures/how-the-pls-model-is-calculated), along with 6.7.4, A geometric interpretation of PLS. – JeeyCi Aug 17 '22 at 05:28
  • @Elkan: just one remark to avoid misunderstanding of my comment above: PLS solves the multicollinearity problem, since PC1 is the linear combination with the largest possible explained variation and PC2 is the best of what is left, following the linearity assumptions of this analysis. I just meant that if more than two features are correlated, then vector algebra helps (like geographical distances between points), and that is the task for PLS2 (as I understood from other sources), but the PLS2 algorithm is omitted in the article. Anyway, thanks for the link – JeeyCi Aug 17 '22 at 10:15
  • [PCA+LDA vs PLS-DA](https://www.hrpub.org/download/20190930/MS4-13490705.pdf) research and a [PLS-DA+LDA](https://www.researchgate.net/post/PLS-DA-or-LDA-for-NIR-analysis) discussion – JeeyCi Aug 20 '22 at 09:46
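As referenced above, a minimal sketch of the single-column-plus-threshold approach from the first comment (the random data, the +1/-1 class coding, and the number of components are assumptions for illustration):

import numpy as np
from sklearn.cross_decomposition import PLSRegression

X = np.random.rand(20, 5)          # hypothetical two-class data
y = np.repeat([1, -1], 10)         # classes coded as +1 and -1 in a single column

pls = PLSRegression(n_components=2).fit(X, y)

# continuous predictions are thresholded at 0, the midpoint of +1 and -1
y_pred = np.where(pls.predict(X).ravel() >= 0, 1, -1)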

You can use the Linear Discriminant Analysis class in scikit-learn; it will take integers for the y value:

LDA-SKLearn

Here is a short tutorial on how to use the LDA: sklearn LDA tutorial
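A minimal sketch of this approach (the module path sklearn.discriminant_analysis and the toy data are assumptions; older releases exposed the class as sklearn.lda.LDA):

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.array([[1.0, 2.0, 0.5],
              [0.9, 2.1, 0.4],
              [1.1, 1.9, 0.6],
              [3.0, 0.5, 2.0],
              [3.1, 0.4, 2.1],
              [2.9, 0.6, 1.9]])
y = np.array([0, 0, 0, 1, 1, 1])   # integer class labels

lda = LinearDiscriminantAnalysis(n_components=1)
X_r = lda.fit_transform(X, y)      # projection onto the discriminant axis
print(lda.predict(X))              # predicted class labels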

Kyle54
  • ... when having >1 latent variable - [discussed here](https://stats.stackexchange.com/questions/455551/difference-between-lda-and-pls-da) – JeeyCi Aug 15 '22 at 10:42

I know this has been answered thoroughly, but I wanted to add what worked for me, in case it's helpful for anyone else who finds this question:

I was following this tutorial, which deals with this in the following way (I had a dataset where the two class labels were AD and CN):

import numpy as np
from sklearn.cross_decomposition import PLSRegression

# df is a pandas DataFrame whose first column-index level holds the class labels

# Create a pseudolinear Y value against which to correlate the samples
# (in my case I used AD)
y = [g == 'AD' for g in df.columns.get_level_values(0)]
y

# Convert the boolean values to numerical (1 for AD, 0 otherwise)
y = np.array(y, dtype=int)
y

# Continue as normal with your PLS-DA model
plsr = PLSRegression(n_components=2, scale=False)
plsr.fit(df.values.T, y)
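As a follow-up (a sketch under the assumption that you want to look at the first two components, continuing from the fitted model above), the per-sample projections are available as x_scores_:

# scores of each sample on the two PLS components (shape: n_samples x 2);
# plotting column 0 against column 1, coloured by y, gives the usual PLS-DA scores plot
scores = plsr.x_scores_
print(scores[:, 0])   # component 1
print(scores[:, 1])   # component 2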


Not exactly what you are looking for, but you might want to check these two threads about how to call native (C/C++) code from Python, and a C++ PLS library implementation:

Partial Least Squares Library

Calling C/C++ from Python?

You can use Boost.Python to expose the C++ code to Python. Here is an example taken from the official site:

Following C/C++ tradition, let's start with the "hello, world". A C++ Function:

char const* greet()
{
   return "hello, world";
}

can be exposed to Python by writing a Boost.Python wrapper:

#include <boost/python.hpp>

BOOST_PYTHON_MODULE(hello_ext)
{
    using namespace boost::python;
    def("greet", greet);
}

That's it. We're done. We can now build this as a shared library. The resulting DLL is now visible to Python. Here's a sample Python session:

>>> import hello_ext
>>> print hello_ext.greet()
hello, world
0x90