
I am trying to validate a psychometric scale using CFA in R. The items were measured on a 5-point Likert scale. The model has 6 factors and 66 items, with N = 200. Here is part of my model:

first.model<-'
Plan=~AS8+PL1+FO8+ID3
Improvement=~IO3+IO8+IO6+IO2+IO4+IO5+AS1+AS2
Influence=~IN4+IN13+IN6+IN15+IN2+IN12+IN7+IN9+IN11+IN8+IN5
Idea=~PR2+A16+O8+PR1+O12+PR11+O4+PR3+O14+O13+A11
Active=~PR8+AS6+AC1+AS7+AC8+AS13+AS10+AC6+AS9+E4+PL4
+A15+PL7+PR12+PR15+E10+AS3
Goal=~GF11+GF4+GF10+GF1+GF13+GF7+GF14+GF6+GF2+GF8
+PL9+GF5+PL10+E7+PL6
'
# Declare all 66 items as ordered (categorical) indicators
first.fit<-cfa(first.model, data=NE2, ordered = 
               c("AS8","PL1","FO8","ID3","IO3","IO8","IO6","IO2","IO4",
                 "IO5","AS1","AS2","IN4","IN13","IN6","IN15","IN2","IN12",
                 "IN7","IN9","IN11","IN8","IN5","PR2","A16","O8","PR1",
                 "O12","PR11","O4","PR3","O14","O13","A11","PR8","AS6",
                 "AC1","AS7","AC8","AS13","AS10","AC6","AS9","E4","PL4",
                 "A15","PL7","PR12","PR15","E10","AS3","GF11","GF4","GF10",
                 "GF1","GF13","GF7","GF14","GF6","GF2","GF8","PL9","GF5",
                 "PL10","E7","PL6"),std.lv=TRUE)

However, when I run the second part (which treats the items as categorical), I receive this warning:

Warning message:
In lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats, :
  lavaan WARNING: The variance-covariance matrix of the estimated parameters (vcov) does not appear to be positive definite! The smallest eigenvalue (= -9.174795e-17) is smaller than zero. This may be a symptom that the model is not identified.

When I checked for Heywood cases, there were no negative variances and no factor covariances greater than 1:

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv
   .AS8               0.594                               0.594
   .PL1               0.215                               0.215
   .FO8               0.659                               0.659
   .ID3               0.973                               0.973
   .IO3               0.652                               0.652
   .IO8               0.699                               0.699
   ... (rows omitted)

Covariances:
                            Estimate  Std.Err  z-value  P(>|z|)
  Plan ~~                                                      
    Imprvmnt                0.470    0.060    7.809    0.000
    Influence               0.512    0.060    8.514    0.000
    Idea                    0.688    0.056   12.331    0.000
    Active                  0.696    0.051   13.650    0.000
    Goal                    0.545    0.057    9.558    0.000
... (etc etc)

Any suggestions for next steps would be appreciated. Thank you!

ZXX

1 Answer


> I checked for heywood cases

That is not what the warning message refers to. `vcov(first.fit)` contains the sampling (co)variances of the estimated parameters (not of the variables). Its diagonal contains the sampling variances, whose square roots are the standard errors reported in the Std.Err column. Basically, this message is warning about possible linear dependencies (redundancies) among some combination of model parameters. The reported eigenvalue is effectively zero (and probably negative only because of limited machine precision), so if there are no other signs that the model is not identified, it is probably nothing to be concerned about. Your CFA syntax is quite straightforward and does not break any identification rules of thumb.
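To see whether an eigenvalue such as -9.17e-17 can be told apart from zero, one option is to compare it with a tolerance scaled by the largest eigenvalue. Here is a rough base-R sketch. The helper `smallest_eigen_check()` and its default tolerance are my own illustrative choices, not part of lavaan, and the toy matrix only stands in for `vcov(first.fit)`.

```r
# Hypothetical helper (not a lavaan function): compare the smallest
# eigenvalue of a symmetric matrix with a tolerance scaled by the largest.
smallest_eigen_check <- function(V, tol = sqrt(.Machine$double.eps)) {
  ev <- eigen(V, symmetric = TRUE, only.values = TRUE)$values
  list(
    min_eigen        = min(ev),
    max_eigen        = max(ev),
    effectively_zero = abs(min(ev)) < tol * max(abs(ev))
  )
}

# Toy stand-in for a parameter covariance matrix: the third column of X is
# the sum of the first two, so crossprod(X) has one redundant dimension.
X <- cbind(c(1, 0, 1, 2), c(0, 1, 1, 1))
X <- cbind(X, X[, 1] + X[, 2])
V_toy <- crossprod(X)

res <- smallest_eigen_check(V_toy)
print(res)

# With the fitted model from the question, the analogous call would be
# (assuming the fit object exists in the session):
# smallest_eigen_check(unclass(vcov(first.fit)))
```

This only shows that the sign of such an eigenvalue is numerical noise. Whether a near-zero eigenvalue reflects a real redundancy among parameters still has to be judged from other signs of non-identification.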

I suspect the bigger problem is that you are using only N = 200 cases to reproduce 66*4 (thresholds) + 66*65/2 (polychoric correlations) = 2409 summary statistics. DWLS doesn't require as large N as ADF/WLS (Flora & Curran, 2004), but simulation studies with much smaller models than yours have shown that DWLS still needs more information than N = 200 (e.g., Bandalos, 2014). Given your model size, I would not be comfortable estimating models with N < 1000 using DWLS. If your response distributions are not terribly skewed, I would consider treating them as continuous and using MLR (Rhemtulla et al., 2012) to account for nonnormality.
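The count of summary statistics above can be checked with a few lines of R (the variable names are mine, chosen for illustration):

```r
# Number of sample statistics the categorical (DWLS) model must reproduce
n_items      <- 66    # items in the model
n_categories <- 5     # 5-point Likert scale
N            <- 200   # sample size

n_thresholds <- n_items * (n_categories - 1)   # 4 thresholds per item
n_polychoric <- n_items * (n_items - 1) / 2    # unique item pairs
n_stats      <- n_thresholds + n_polychoric

cat("thresholds:", n_thresholds, "\n")
cat("polychoric correlations:", n_polychoric, "\n")
cat("total summary statistics:", n_stats, "\n")
cat("summary statistics per case:", round(n_stats / N, 2), "\n")
```

With N = 200 this works out to roughly 12 summary statistics per observed case.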

Terrence
  • Thank you for your input. If I understand your explanation correctly, the main problem is the sample size. Is that right? To proceed with the analysis, should I simply remove the part of the code that treats the items as categorical, or is there other code I should run? – ZXX Nov 11 '21 at 21:48
  • You can replace the `ordered=` argument with something like `estimator="MLR"`. See the `?lavOptions` help page for details. – Terrence Nov 17 '21 at 15:35
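
A minimal sketch of that change, using lavaan's built-in `HolzingerSwineford1939` example data and its standard three-factor model instead of the `NE2` data from the question, which is not available here. Those example items are continuous test scores rather than Likert items, so this only illustrates the syntax. For the model in the question, the analogous call would presumably be `cfa(first.model, data = NE2, estimator = "MLR", std.lv = TRUE)`.

```r
library(lavaan)

# Standard three-factor example model for lavaan's built-in data set
HS.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# No ordered= argument: indicators are treated as continuous, and
# estimator = "MLR" requests ML with robust standard errors and a
# scaled test statistic.
fit_mlr <- cfa(HS.model,
               data      = HolzingerSwineford1939,
               estimator = "MLR",
               std.lv    = TRUE)

summary(fit_mlr, fit.measures = TRUE, standardized = TRUE)
```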