I have created a questionnaire composed of four subscales, each measuring a different component of my variable of interest. Each subscale is composed of 3 items, and each item is rated on a 6-point scale (so responses to each item range from 1 to 6).
Here is a sample of my data; each row is a subject:
> dput(DF[1:10, 7:18 ])
structure(list(I1 = c(3, 6, 6, 4, 5, 5, 3, 3, 5, 4), I2 = c(3,
5, 5, 6, 4, 5, 2, 5, 5, 4), I3 = c(1, 4, 2, 3, 3, 4, 4, 1, 5,
2), I4 = c(5, 6, 6, 6, 5, 6, 6, 6, 6, 6), I5 = c(5, 6, 5, 5,
6, 6, 5, 6, 5, 5), I6 = c(4, 6, 6, 6, 5, 5, 6, 4, 5, 4), I7 = c(3,
6, 5, 6, 4, 4, 3, 5, 3, 4), I8 = c(4, 6, 5, 5, 4, 4, 3, 5, 3,
5), I9 = c(4, 6, 4, 4, 5, 5, 5, 4, 4, 3), I10 = c(2, 4, 5, 6,
3, 2, 4, 1, 2, 4), I11 = c(3, 3, 4, 6, 4, 6, 5, 5, 2, 3), I12 = c(3,
6, 6, 6, 5, 4, 4, 4, 5, 5)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))
217 participants completed this questionnaire (no missing values), and I want to test whether my data support my model with a CFA.
Here is my code:
library(lavaan)
model <- "
Factor1 =~ I1 + I2 + I3
Factor2 =~ I4 + I5 + I6
Factor3 =~ I7 + I8 + I9
Factor4 =~ I10 + I11 + I12
"
fit <- cfa(model, data = DF)
summary(fit, fit.measures = TRUE, standardized = TRUE)
But when I run it, I get the following warnings and I can't understand why. Here are the messages:
lavaan WARNING: the optimizer warns that a solution has NOT been found!
lavaan WARNING: the optimizer warns that a solution has NOT been found!
lavaan WARNING: Could not compute standard errors! The information matrix could not be inverted. This may be a symptom that the model is not identified.
lavaan WARNING: some estimated ov variances are negative
lavaan WARNING: covariance matrix of latent variables
is not positive definite; use lavInspect(fit, "cov.lv") to investigate.
Here is what I get with lavInspect:
> lavInspect(fit, "cov.lv")
Factr1 Factr2 Factr3 Factr4
Factor1 7797.062
Factor2 0.248 0.451
Factor3 0.215 0.182 0.289
Factor4 -0.254 -0.159 0.280 9883.238
The huge variances for Factor1 and Factor4 seem to go together with the very large negative variances that lavaan displays for I1 (-7795.413) and I10 (-9881.204). However, if I ask R directly for var(DF$I1) and var(DF$I10), the results are very different. Both outputs follow:
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.I1 -7795.413 NA -7795.413 -4729.827
.I2 1.684 NA 1.684 1.000
.I3 1.535 NA 1.535 1.000
.I4 0.807 NA 0.807 0.641
.I6 1.859 NA 1.859 0.884
.I7 1.370 NA 1.370 0.826
.I8 1.201 NA 1.201 0.832
.I9 1.681 NA 1.681 0.950
.I10 -9881.204 NA -9881.204 -4859.350
.I11 2.215 NA 2.215 1.000
.I12 0.784 NA 0.784 1.000
> var(DF$I1)
[1] 1.683052
> var(DF$I10)
[1] 1.966163
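A note on this comparison, as far as I understand the output: the leading dot in .I1 marks a residual variance, not the total variance of the item, so it is not directly comparable to var(DF$I1). Assuming the default cfa() parameterization, where the first loading of each factor is fixed to 1, the model-implied total variance of I1 would be the Factor1 variance plus the I1 residual variance. Here is a small sketch of that arithmetic, using only numbers copied from the output above (the variable names are mine):

```r
# Values copied from the lavaan output above.
factor_var <- c(I1 = 7797.062, I10 = 9883.238)    # Var(Factor1), Var(Factor4)
resid_var  <- c(I1 = -7795.413, I10 = -9881.204)  # .I1 and .I10

# Assumption: marker-item loading fixed to 1 (the cfa() default).
loading <- 1

# Model-implied total variance of each marker item:
# loading^2 * factor variance + residual variance.
implied_var <- loading^2 * factor_var + resid_var

# Sample variances reported by var() above.
sample_var <- c(I1 = 1.683052, I10 = 1.966163)

print(round(rbind(implied = implied_var, sample = sample_var), 3))
```

The implied totals come out at about 1.649 and 2.034, which is in the same range as the sample variances. So the extreme values seem to come from how that total is split between the factor and the residual, not from the items themselves.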
Does anyone know why it is not working? Is it because my model doesn't fit my data well enough?
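In case it helps with the diagnosis, here is a minimal sketch for checking the item correlations within each subscale. My guess, which the output does not confirm, is that items which barely correlate with the other items of their subscale could make a factor hard to estimate. The sketch uses only the 10-row sample above, rebuilt as a plain data frame; sample_df, subscales and within_cor are names I made up, and on the full data DF[, 7:18] would replace sample_df.

```r
# 10-row sample copied from the dput() output above (hypothetical name).
sample_df <- data.frame(
  I1  = c(3, 6, 6, 4, 5, 5, 3, 3, 5, 4),
  I2  = c(3, 5, 5, 6, 4, 5, 2, 5, 5, 4),
  I3  = c(1, 4, 2, 3, 3, 4, 4, 1, 5, 2),
  I4  = c(5, 6, 6, 6, 5, 6, 6, 6, 6, 6),
  I5  = c(5, 6, 5, 5, 6, 6, 5, 6, 5, 5),
  I6  = c(4, 6, 6, 6, 5, 5, 6, 4, 5, 4),
  I7  = c(3, 6, 5, 6, 4, 4, 3, 5, 3, 4),
  I8  = c(4, 6, 5, 5, 4, 4, 3, 5, 3, 5),
  I9  = c(4, 6, 4, 4, 5, 5, 5, 4, 4, 3),
  I10 = c(2, 4, 5, 6, 3, 2, 4, 1, 2, 4),
  I11 = c(3, 3, 4, 6, 4, 6, 5, 5, 2, 3),
  I12 = c(3, 6, 6, 6, 5, 4, 4, 4, 5, 5)
)

# Item-to-subscale mapping, matching the lavaan model syntax above.
subscales <- list(
  Factor1 = c("I1", "I2", "I3"),
  Factor2 = c("I4", "I5", "I6"),
  Factor3 = c("I7", "I8", "I9"),
  Factor4 = c("I10", "I11", "I12")
)

# One 3 x 3 Pearson correlation matrix per subscale, rounded for reading.
within_cor <- lapply(subscales, function(items) {
  round(cor(sample_df[, items]), 2)
})

print(within_cor)
```

With only 10 rows these correlations are very noisy, so the pattern would need to be checked on all 217 participants.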
Thank you in advance!