
I'm new to R programming and am trying to create a polychoric correlation matrix with the polycor package. I got the polychor function running without an error message, but something is not right because it returns just one number even though I have nine variables. If I use hetcor I get a correlation matrix, but all the correlations are reported as Pearson correlations. The variables are ordinal and non-normally distributed (so I have to adjust for non-normality in a subsequent factor analysis); I don't understand why I'm getting Pearson correlations rather than polychoric correlations. The code I've used for each function is below. If anyone has suggestions on how to force hetcor to give me polychoric correlations, or if anyone knows why polychor is returning a single value, I would appreciate hearing from you. Thanks!

Polycor::polychor (GTP5, ML=FALSE, std.err=FALSE, maxcor=.9999)  
Polycor::hetcor (GTP5, ML=FALSE, std.err=TRUE)
  • Can you share a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? I'm not familiar with the GTP5 dataset – Peter Smittenaar May 10 '19 at 20:12
  • Thanks Peter! I created a dummy dataset just for working through the EFA. The response I get from R from running the polycor code above is: in polycor::polychor(GTP5, ML = FALSE, std.err = FALSE, maxcorr = 0.9999) : unused argument (maxcorr = 0.9999) > polycor::polychor (GTP5, ML=FALSE, std.err=FALSE, maxcor=.9999) [1] 0.04742959 Warning message: In polycor::polychor(GTP5, ML = FALSE, std.err = FALSE, maxcor = 0.9999) : 65 rows with zero marginals removed. – Kevin Jefferson May 10 '19 at 21:11
  • Here are the first few lines of the data set I created. 2 0 2 1 3 0; 1 0 0 0 1 0; 1 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 0 The formatting differs between how I type this and how it appears, but basically I have rows representing individuals and columns representing scale items. – Kevin Jefferson May 10 '19 at 21:15

1 Answer


The polycor::polychor documentation says you need to give it either two ordered categorical variables or a contingency table of counts. So it doesn't automatically calculate pairwise correlations for each variable. It's probably parsing your dataframe as one large contingency table, which would also explain the warning about rows with zero marginals being removed.

Usage:

     polychor(x, y, ML = FALSE, control = list(), std.err = FALSE, maxcor=.9999)

Arguments:

       x: a contingency table of counts or an ordered categorical
          variable; the latter can be numeric, logical, a factor, or an
          ordered factor, but if a factor, its levels should be in
          proper order.

       y: if ‘x’ is a variable, a second ordered categorical variable.

You'll have to call this function once for each pair of variables. Here's an SO post on applying functions across all pairwise combinations of columns
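A minimal sketch of that pairwise loop, in case it helps. The `items` data frame and the `pairwise_polychor` helper are hypothetical stand-ins (the real GTP5 data isn't available here); only `polycor::polychor(x, y)` comes from the package itself.

```r
library(polycor)

# Hypothetical stand-in for the asker's data: 200 respondents, 4 items scored 0-3.
set.seed(1)
latent <- rnorm(200)
items <- as.data.frame(sapply(1:4, function(i) {
  cut(latent + rnorm(200), breaks = c(-Inf, -1, 0, 1, Inf), labels = FALSE) - 1L
}))
names(items) <- paste0("item", 1:4)

# Hypothetical helper: call polychor() once per pair of columns
# and fill a symmetric matrix with the results.
pairwise_polychor <- function(df) {
  p <- ncol(df)
  R <- diag(p)                               # 1s on the diagonal
  dimnames(R) <- list(names(df), names(df))
  pairs <- combn(p, 2)                       # each column is one (i, j) pair
  for (k in seq_len(ncol(pairs))) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    R[i, j] <- R[j, i] <- polycor::polychor(df[[i]], df[[j]])
  }
  R
}

R <- pairwise_polychor(items)
round(R, 2)
```

Passing the columns as `df[[i]]` and `df[[j]]` sidesteps the "object not found" problem, because the vectors are taken from the data frame rather than looked up as free-standing variables.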

p.s. the package is polycor, uncapitalised.
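On the hetcor half of the question: hetcor chooses the correlation type from each column's class, using Pearson for pairs of numeric columns and polychoric for pairs of ordered factors. A rough sketch, assuming the items are stored as plain numbers (the `items` data below is made up for illustration):

```r
library(polycor)

# Hypothetical stand-in data: 200 respondents, 3 items scored 0-3, stored as numbers.
set.seed(1)
latent <- rnorm(200)
items <- as.data.frame(sapply(1:3, function(i) {
  cut(latent + rnorm(200), breaks = c(-Inf, -1, 0, 1, Inf), labels = FALSE) - 1L
}))
names(items) <- paste0("item", 1:3)

# Convert every column to an ordered factor so hetcor treats it as ordinal.
items_ord <- as.data.frame(lapply(items, as.ordered))

het <- polycor::hetcor(items_ord, ML = FALSE, std.err = FALSE)
het$type           # correlation type used for each pair
het$correlations   # the correlation matrix itself
```

If the items are left numeric, the same call should report Pearson throughout, which matches what the question describes.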

  • Thank you, that is a pain but good to know. LOL maybe I'll just create the matrix in Stata! The package is polycor. My word processing program where I'm writing code just keeps capitalizing the p. – Kevin Jefferson May 10 '19 at 21:19
  • Sure thing. If my answer answered your question make sure to mark it as the correct answer. – Peter Smittenaar May 10 '19 at 21:34
  • Sorry, I don't know if I can open this question back up. I just tried this specifying a contingency table for the correlation of items 1 and 2. I saved this in a csv file called MDI1.MDI2. Below is the contingency table and my code. I do not specify a y variable. If there is something glaringly wrong with my code could you let me know. One Two Three Four Five One 160 5 7 1 1; Two 114 7 6 2 0; Three 112 13 14 5 2; Four 47 4 8 7 1; Five 34 5 7 3 5 – Kevin Jefferson May 11 '19 at 23:26
  • Error code is Error in FUN(X[[i]], ...) : only defined on a data frame with all numeric variables. Code is polycor::polychor (MDI12, ML=FALSE, std.err=FALSE, maxcor=.9999). MDI12 is what I assigned the contingency table to when I read it into R. Thank you so much if anyone can help with this. The error code makes no sense to me because I thought we were supposed to read in a contingency table. I also tried this code polycor::polychor(mdi1, mdi2, ML = FALSE, std.err = FALSE, maxcor=.9999) where MDI1 and MDI2 are two scale items in a master dataset I read in. – Kevin Jefferson May 11 '19 at 23:30
  • I've done this with x=mdi1 and y=mdi1 too. I keep getting an error code Error in table(x, y) : object 'mdi1' not found. mdi is in there though. I've tried this with caps and no caps, with specifying GTP6 where I read in the master dataset and with calling in the master dataset and then running this code immediately afterwards. I still get this code. Again thank you so much if anyone is able to help! – Kevin Jefferson May 11 '19 at 23:33
  • Hi Kevin, it's hard to say what's wrong when I can't play with the data myself. Can you follow the guidelines on [this page](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), specifically around using `dput(df)` to output data (e.g. your contingency table) and copying that into your original question? Please use `` symbols when pasting data. Check you actually have the variables in workspace - `object 'mdi1' not found` really means it just doesn't exist as a variable. – Peter Smittenaar May 13 '19 at 14:12
  • Thank you so much! I hope I'm using dput correctly. Here is the contingency table output from dput (MDI12) structure(list(X = structure(c(3L, 5L, 4L, 2L, 1L), .Label = c("Five", "Four", "One", "Three", "Two"), class = "factor"), One = c(160L, 114L, 112L, 47L, 34L), Two = c(5L, 7L, 13L, 4L, 5L), Three = c(7L, 6L, 14L, 8L, 7L), Four = c(1L, 2L, 5L, 7L, 3L), Five = c(1L, 0L, 2L, 1L, 5L)), class = "data.frame", row.names = c(NA, -5L )) – Kevin Jefferson May 13 '19 at 16:30
  • Again here is the command I'm trying: "Polycor::polychor (MDI12, ML=FALSE, std.err=FALSE, maxcor=.9999) " – Kevin Jefferson May 13 '19 at 16:35
  • I tried it again just now and got this error message, which I'm googling now: "Error in FUN(X[[i]], ...) : only defined on a data frame with all numeric variables > " – Kevin Jefferson May 13 '19 at 16:36
  • HEY I GOT IT! I had to specify the dataset again in calling the x and the y variables. This is the code I used "polycor::polychor(GTP6, x=GTP6$mdi1, y=GTP6$mdi2, ML = FALSE, std.err = FALSE, maxcor=.9999) " So now I just have to do this for all the variables pairwise and then I'm good. Thank you so much for helping me out with this! – Kevin Jefferson May 13 '19 at 16:41
  • glad it's working, though I'm surprised it works given the function specification doesn't mention anything about giving a dataframe as first argument. You're sure you're getting the correct value for the correlation, yes? – Peter Smittenaar May 14 '19 at 08:03
  • The values certainly look right. Hopefully that means I'm on the right track! I'll run a Spearman correlation matrix too though just to see what that looks like! My factor analysis on the other hand is most certainly not working. LOL. – Kevin Jefferson May 14 '19 at 16:07
  • OK I ran the Spearman correlation and I don't think the polychoric matrix is right after all. It looks like using fa in the psych package, though, lets me input raw data and specify using a polychoric matrix, so I might not have to compute this by hand. I'm trying to figure this out now. – Kevin Jefferson May 14 '19 at 20:00
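
For the contingency-table attempt in the comments above, one likely cause of the "only defined on a data frame with all numeric variables" error is the label column `X` that came along when the CSV was read in. A sketch that rebuilds the posted table and passes only the counts, as a numeric matrix (the object name `tab` is just for illustration):

```r
library(polycor)

# The 5x5 table posted in the comments, rebuilt by hand from the dput() output.
MDI12 <- data.frame(
  X     = c("One", "Two", "Three", "Four", "Five"),
  One   = c(160L, 114L, 112L, 47L, 34L),
  Two   = c(5L, 7L, 13L, 4L, 5L),
  Three = c(7L, 6L, 14L, 8L, 7L),
  Four  = c(1L, 2L, 5L, 7L, 3L),
  Five  = c(1L, 0L, 2L, 1L, 5L)
)

# Drop the label column so that only counts remain, then convert to a matrix.
tab <- as.matrix(MDI12[, -1])
rownames(tab) <- MDI12$X

rho <- polycor::polychor(tab, ML = FALSE, std.err = FALSE)
rho
```

As for the call that eventually ran, `polychor(GTP6, x=GTP6$mdi1, y=GTP6$mdi2, ...)`: under R's argument matching rules the unnamed `GTP6` is presumably matched to a leftover formal argument rather than to `x`, so the result should be the same as calling `polychor(GTP6$mdi1, GTP6$mdi2)` directly.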