I'm trying to run a multiple factor analysis (MFA) on the final data frame, which holds all of the information about the 4 studied species.
library(FactoMineR)
ff <- final_frame
ff <- ff[, colSums(is.na(ff)) != nrow(ff)] # drop columns that consist entirely of NA
# The first d + 2 columns (two species columns plus d character columns) are categorical; turn them into factors
for (i in 1:(d + 2)) {
  ff[, i] <- as.factor(ff[, i])
}
n_climatic <- ncol(ff) - maxPredict * 2 - d - 2 # number of columns in the last (climatic) group
res <- MFA(ff[, 2:ncol(ff)], group = c(1, d, maxPredict * 2, n_climatic), type = c("n", "n", "s", "s"), ncp = 2, name.group = c("hyp.species", "characters", "morphometrics", "climatic"))
plot(res,choix="ind",partial="all")
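To rule out a mismatch in the group argument, here is a minimal sketch that recomputes the group sizes for both data frames and checks that they add up to the number of columns passed to MFA (all columns except the first). The helper name check_groups is made up for this illustration; the column counts come from the str() output and the values of d and maxPredict from the end of this post.

```r
# Hypothetical helper: recompute the four group sizes used in the MFA call
# and verify that they cover exactly the columns 2..n_cols.
check_groups <- function(n_cols, d, maxPredict) {
  n_last <- n_cols - maxPredict * 2 - d - 2      # size of the last group
  groups <- c(1, d, maxPredict * 2, n_last)
  stopifnot(all(groups > 0), sum(groups) == n_cols - 1)
  groups
}

small_groups <- check_groups(8, d = 2, maxPredict = 1)   # 4-sample frame, 8 columns
big_groups   <- check_groups(16, d = 2, maxPredict = 3)  # 40-sample frame, 16 columns

print(small_groups)
print(big_groups)
```

Both calls pass, so the group sizes themselves look consistent for the two frames.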
However, I keep getting an error message saying:
Error in eigen(crossprod(t(X), t(X)), symmetric = TRUE) :
infinite or missing values in 'x'
This is weird because my data frame doesn't contain any infinite or NA values.
> ff
      cur_species hyp_species Character1 Character2        X1         X2 X1.1      X2.1
1103A           0           2          0          0 0.6647259 0.05703327 1139 -0.383871
1103B           0           0          0          0 0.6647259 0.05703327 1183 -4.183870
1103C           1           1          1          1 0.6647259 0.05703327 1196  0.320000
1103D           2           1          2          1 0.6647259 0.05703327 1160  1.845160
I have even checked it with str(ff) and many other functions. There are no missing values.
> str(ff)
'data.frame': 4 obs. of 8 variables:
$ cur_species: Factor w/ 3 levels "0","1","2": 1 1 2 3
$ hyp_species: Factor w/ 3 levels "0","1","2": 3 1 2 2
$ Character1 : Factor w/ 3 levels "0","1","2": 1 1 2 3
$ Character2 : Factor w/ 2 levels "0","1": 1 1 2 2
$ X1 : num 0.665 0.665 0.665 0.665
$ X2 : num 0.057 0.057 0.057 0.057
$ X1.1 : num 1139 1183 1196 1160
$ X2.1 : num -0.384 -4.184 0.32 1.845
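The check can be made explicit with a sketch like the one below. It rebuilds the 4-sample frame by hand from the printed output (a transcription, not the CSV itself; the name ff_small is made up here), counts NA and non-finite entries per column, and lists the numeric columns that hold a single repeated value. That last part is an assumption about where to look: the two numeric groups are passed with type "s", which standardizes them, and X1 and X2 are constant in this frame, so a division by a zero standard deviation may be where the missing values in the error come from.

```r
# Rebuild the 4-sample frame from the printed output (transcribed by hand).
ff_small <- data.frame(
  cur_species = factor(c(0, 0, 1, 2)),
  hyp_species = factor(c(2, 0, 1, 1)),
  Character1  = factor(c(0, 0, 1, 2)),
  Character2  = factor(c(0, 0, 1, 1)),
  X1   = rep(0.6647259, 4),
  X2   = rep(0.05703327, 4),
  X1.1 = c(1139, 1183, 1196, 1160),
  X2.1 = c(-0.383871, -4.183870, 0.320000, 1.845160),
  row.names = c("1103A", "1103B", "1103C", "1103D")
)

num_cols <- vapply(ff_small, is.numeric, logical(1))

# Number of NA values in every column.
na_counts <- colSums(is.na(ff_small))

# Number of non-finite values (NA, NaN, Inf, -Inf) in every numeric column.
nonfinite_counts <- vapply(ff_small[num_cols],
                           function(x) sum(!is.finite(x)), integer(1))

# Numeric columns that contain only one distinct value (zero variance).
is_constant <- vapply(ff_small[num_cols],
                      function(x) length(unique(x)) == 1, logical(1))
constant_cols <- names(which(is_constant))

print(na_counts)
print(nonfinite_counts)
print(constant_cols)
```

If constant_cols is not empty, dropping those columns before the MFA call (and reducing the matching group size) would be one thing to try.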
I have spent days checking the whole script and trying to figure out what's wrong, but I still have no idea how to make this work. The weirdest thing is that the same script works perfectly for a much larger, 40-sample data frame, which actually does contain NAs.
> str(ff)
'data.frame': 40 obs. of 16 variables:
$ cur_species: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ hyp_species: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Character1 : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Character2 : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ X1 : num NA 0.665 0.665 NA NA ...
$ X2 : num NA -0.00142 -0.00142 NA NA ...
$ X3 : num NA 0.00232 0.00232 NA NA ...
$ X4 : num NA 0.057 0.057 NA NA ...
$ X5 : num NA 0.0479 0.0479 NA NA ...
$ X6 : num NA 0.0487 0.0487 NA NA ...
$ X1.1 : num 1550 1718 1718 NA NA ...
$ X2.1 : num 218 209 209 NA NA ...
$ X3.1 : num 197 193 193 NA NA ...
$ X4.1 : num 0.199 0.104 0.104 NA NA ...
$ X5.1 : num 1.03 1.12 1.12 NA NA ...
$ X6.1 : num 11.9 13.5 13.5 NA NA ...
Does anyone have any idea what may be causing this problem and how to fix it? I would greatly appreciate any help!
P.S.: Sorry, I had to make this reproducible. Here you can download final_frame.csv - http://www.sharecsv.com/s/752a1b1646013e511e851e99cf969445/final_frame.csv
final_frame <- read.csv("final_frame.csv", header = TRUE, sep = ",", row.names = 1)
Run this command before the script and you'll get the same 4-sample data frame I used. Also, for this data frame:
d<- 2
maxPredict<- 1
And here is the file for the larger, 40-sample data frame that works with my script - http://www.sharecsv.com/s/39188c26e1f17e32868bfe799be0a942/big_final_frame.csv
maxPredict <- 3
d <- 2
Thank you