R MASS::lda using cov.mve method - reproducibility issues

Question

I am trying to fit an LDA model to data that are multivariate non-normal. I was hoping to get a more robust estimate by choosing method = 'mve'. However, this leads to predictions that vary from one fit to the next, even though the data and the call are unchanged. A minimal example:

library(MASS)
library(caret)
set.seed(1)

data(iris)

acc <- list()
for (i in 1:100) {
    post_hoc <- lda(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                    data = iris, method = 'mve')
    conf <- table(list(predicted = predict(post_hoc)$class, observed = iris$Species))
    acc <- append(acc, as.numeric(confusionMatrix(conf)$overall[1]))
}
hist(as.numeric(acc))

Looking at the lda.R code, I see that it does not set a seed for the cov.rob function. How can I get reproducible results?

jay.sf · Answer 1 · 2022-06-02T18:51:55.927

If you call set.seed() immediately before each call to lda(), the results are identical:

f <- \() {
  acc <- list()
  for (i in 1:100) {
    set.seed(1)
    post_hoc <- lda(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                    data=iris , method = 'mve')
    conf <- table(list(predicted=predict(post_hoc)$class , observed=iris$Species ))
    acc <- append(acc, as.numeric(confusionMatrix(conf)$overall[1]))
  }
  acc
}

library(MASS); library(caret)

acc1 <- f()
all(sapply(acc1, all.equal, acc1[[1]]))
# [1] TRUE

edited Jun 02 '22 at 18:51

answered Jun 02 '22 at 18:00

jay.sf

The issue is not whether two runs are identical; there should not be any variation within a single run. If you use a different 'method', e.g. moment/t/mle, you get reproducible results. Isn't setting the seed supposed to instruct the 'cov.rob' method to sample the same subset of data points and produce the same results? – Israel Zadok Jun 02 '22 at 18:27
@IsraelZadok Well, then set it before `lda` which is stochastic, see update. – jay.sf Jun 02 '22 at 18:53
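
The point in the comments above can be seen by calling cov.rob directly. The sketch below assumes that, for data of this size, cov.rob with its default nsamp = "best" falls back to random subsampling, so each call draws from (and advances) R's random number generator. Using the four numeric iris columns as input is only an illustration, not what lda passes internally.

```r
library(MASS)

x <- as.matrix(iris[, 1:4])

# One seed, two consecutive calls: the second call starts from wherever
# the first call left the random number generator, so it may pick
# different subsamples and return a different estimate.
set.seed(1)
a <- cov.rob(x, method = "mve")
b <- cov.rob(x, method = "mve")
same_without_reseed <- isTRUE(all.equal(a$cov, b$cov))  # not guaranteed TRUE

# Re-seeding before each call puts the generator in the same state both
# times, so the same subsamples are drawn.
set.seed(1)
c1 <- cov.rob(x, method = "mve")
set.seed(1)
c2 <- cov.rob(x, method = "mve")
same_with_reseed <- identical(c1$cov, c2$cov)
```

A single set.seed(1) before the loop therefore makes the whole sequence of 100 fits repeatable from one session to the next, but it does not make the individual fits equal to one another.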

Israel Zadok · Answer 2 · 2022-06-02T19:11:23.130

O.K., I've edited a version of lda.R with a set.seed() and the results are reproducible. This is strange.
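
An alternative that avoids keeping a patched copy of lda.R might be a small wrapper that seeds the generator, fits the model, and then restores the caller's RNG state. This is only a sketch: lda_mve_seeded is a hypothetical helper name, not part of MASS, and the default seed value is arbitrary.

```r
library(MASS)

# Hypothetical helper (not part of MASS): fit lda(method = "mve") under a
# fixed seed, then put the caller's RNG state back the way it was.
lda_mve_seeded <- function(x, grouping, seed = 1) {
  had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (had_seed) old_seed <- get(".Random.seed", envir = globalenv())
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = globalenv())
    } else {
      rm(".Random.seed", envir = globalenv())
    }
  })
  set.seed(seed)
  lda(x, grouping = grouping, method = "mve")
}

x <- as.matrix(iris[, 1:4])
fit1 <- lda_mve_seeded(x, iris$Species)
fit2 <- lda_mve_seeded(x, iris$Species)

# Both fits started from the same RNG state, so they should agree exactly.
fits_match <- identical(fit1$scaling, fit2$scaling)

# Predictions are requested with explicit newdata here.
pred_match <- identical(predict(fit1, newdata = x)$class,
                        predict(fit2, newdata = x)$class)
```

Restoring the previous state in on.exit keeps the wrapper from silently resetting the random stream for any other code in the session that depends on it.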

edited Jun 02 '22 at 19:11

answered Jun 02 '22 at 18:58

Israel Zadok
