I am trying to generate a polychoric correlation matrix in R-psych for a 227 x 6 data table which I have called nepr. Importing the data from an excel spreadsheet and entering the code:
nepr=as.data.frame(nepr)
attach(nepr)
library(psych)
out=polychoric(nepr)
neprpoly=out$rho
print(neprpoly,digits=2)
generates the following error message:
>Error in if (any(lower > upper)) stop("lower>upper integration
limits"): missing value where TRUE/FALSE needed
>In addition: warning messages:
>1. In polychoric(nepr): The items do not have an equal number
of response alternatives, global set to FALSE.
>2. In qnorm(cumsum(rsum)[-length(rsum)]): NaNs produced
I was expecting the code which I entered to produce a polychoric correlation matrix based on the dataframe nepr and don't know how to interpret/ act on the error messages which I have received.
Can anyone suggest what changes I need to make to the code to address the error messages?
A sample of the dataset is as follows:
structure(list(Balance = c(4, 4, 5, 5, 3, 4, 3, 4, 2, 2, 2, 5,
2, 2, 2, 2, 1, 2, 4, 1), Earth = c(4, 5, 5, 5, 5, 5, 5, 4, 4,
4, 4, 5, 3, 4, 4, 2, 5, 4, 5, 5), Plants = c(2, 2, 2, 3, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 5, 2, 2, 4), Modify = c(2, 2, 1,
1, 2, 2, 2, 2, 4, 2, 4, 2, 4, 2, 2, 2, 2, 2, 2, 2), Growth =
c(2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 4, 1, 4, 2, 2, 4, 4, 4, 1, 2),
Mankind = c(2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1,
1, 1, 2)), row.names = c(NA,20L), class = "data.frame")
The data consists of inputs of Likert scale rankings (ranked 1-5) to the items 'Balance', 'Earth', 'Plants', 'Modify', 'Growth', and 'Mankind'. There are no missing values in any cells of the 227 row x 6 item matrix; Balance, Plants, & Growth all contain the values 1-5; Earth contains the values 2-5 (no ranking of 1 recorded); Mankind contains the values 1-4 (no ranking of 5 recorded). When I ran the original data set (before reversing the valence of the last 3 columns) I was able to get a polychoric matrix with no problems even though the data contained the Earth data as it appears in the nepr data set. I assume that it is not uncommon to have similar data sets from surveys where variables do not necessarily contain the full range of response values.