
I'm trying to run the `lda` function, but I get the error shown below. My dataset has 388 observations and 1026 variables. The column Act contains only "n" or "p", and the other columns are numeric. The code is:

dat<-bbb.fingerprints

head(dat)

N<-nrow(dat)

smp<-sample(1:N, N/3)

smp

train<-dat[-smp, ]

test<-dat[smp, ]

library(MASS)

lda.model <- lda(Act ~ . , data=train)

View(lda.model)

The error is:

Error in lda.default(x, grouping, ...) : 
  variables   18   21   29   39   55   56   59   70   94  104  114  138  150  162  184  199  205  248  268  371  374  383  443  444  450  451  515  535  537  538  554  583  606  619  620  628  636  646  649  655  720  733  756  757  784  798  806  846  849  852  860  867  908  939  978  987  996 1000 1001 appear to be constant within groups

Can you help me, please?

Cristian E. Nuno
Catarina Franco
  • Welcome to StackOverflow. Please provide a [minimum reproducible example](https://stackoverflow.com/a/5963610/2359523) to aid others in answering your question. – Anonymous coward Oct 04 '18 at 17:23
  • Without seeing the data, the error you are getting is because all of those variables are co-linear. Have you done a draftsman's plot and seen if any appear constant? – Anonymous coward Oct 04 '18 at 17:29
  • I doubt that `lda` can handle situations like yours with fewer observations than variables. This isn't the cause of that error, but it will become apparent after you solve the current difficulties. I think you need the assistance of a statistician to plan your analysis. – IRTFM Oct 04 '18 at 20:19

1 Answer


Saw this on Reddit: if `x` is a data frame whose 17th column is the grouping variable and whose remaining columns are the features, run LDA as follows:

lda(x[,-17], grouping=x[,17])
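
That call on its own may still stop with the same message if some features do not vary inside each class, because the check reported in the error happens in `lda.default` itself. Below is a minimal sketch of one way to find and drop such columns before fitting. The toy `train` data frame and the helper `is_const_within` are made up for illustration (they come from neither the question nor MASS), and the `1e-4` tolerance is an assumption meant to mirror what I believe is the default `tol` of `lda`.

```r
library(MASS)

# Toy stand-in for the training data (hypothetical, not the asker's dataset)
set.seed(1)
train <- data.frame(
  Act = factor(rep(c("n", "p"), each = 10)),
  f1  = rnorm(20),
  f2  = rep(c(0, 1), each = 10),  # differs between groups, constant within each
  f3  = rnorm(20)
)

# Hypothetical helper: TRUE when a predictor has (near) zero
# variation after removing each group's mean
is_const_within <- function(x, g, tol = 1e-4) {
  centred <- x - ave(x, g)        # ave() returns the group mean for every row
  sd(centred) < tol
}

features <- setdiff(names(train), "Act")
const <- vapply(train[features], is_const_within, logical(1), g = train$Act)

names(const)[const]               # columns that would trigger the error: "f2"

# Fit on the remaining predictors only
keep <- features[!const]
lda.model <- lda(train[, keep, drop = FALSE], grouping = train$Act)
```

With the question's data the same steps would run on `train` with `Act` as the grouping column. The flagged columns depend on which rows end up in the training sample, and whether dropping them is acceptable is a modelling decision.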
Yaakov Bressler