I am trying to compute the proportion of variance explained by each component of a PLS-DA within the tidymodels framework.
Here is the "gold standard" result from the mixOmics package:
library(mixOmics)
mix_plsda <- plsda(X = iris[-5], Y = iris$Species, ncomp = 4)
mix_var_expl <- mix_plsda$prop_expl_var$X
mix_var_expl
#> comp1 comp2 comp3 comp4
#> 0.729028323 0.227891235 0.037817718 0.005262724
sum(mix_var_expl) # check
#> [1] 1
And here is the same model with recipes::step_pls():
library(recipes)
tidy_plsda <-
  recipe(Species ~ ., data = iris) %>%
  step_pls(all_numeric_predictors(), outcome = "Species", num_comp = 4) %>%
  prep()
tidy_sd <- tidy_plsda$steps[[1]]$res$sd
tidy_sd
#> [1] 0.8280661 0.4358663 1.7652982 0.7622377
tidy_sd^2 / sum(tidy_sd^2)
#> [1] 0.14994532 0.04154411 0.68145793 0.12705264
The element that most resembles an explained variance is sd, but as you can see, there is no obvious relationship between the two vectors.
How can I get mix_var_expl from tidy_plsda? Thanks!
Created on 2022-09-20 by the reprex package (v2.0.1)