I am trying to perform a multivariate multiple regression on my data, to find out whether any of the independent variables has a significant effect on any of the dependent variables.
I have two independent variables, expertise (three levels) and version (two levels), and up to six continuous dependent variables.
Admittedly, I am quite the noob in R, as well as in statistics, and I can't work out what causes the error that the following code produces:
#MULTIVARIATE MULTIPLE REGRESSION --------
m1 <- lm(cbind(subjectDictCount, overlapCount, relativeOverlap, relativeSize, twoGramsCount, twoGramsOverlap) ~ version + expertise, data=manovalijst)
require(car)
summary(Anova(m1))
Error in eigen(qr.coef(SSPE.qr, x$SSPH), symmetric = FALSE) : infinite or missing values in 'x'
My complete dataset has 67 rows (4 novices, 40 experts, and 23 intermediates, with a roughly 50/50 split between versionA and versionB). A sample of my data looks like this:
> dput(manovalijst[c(1:4, 41:43, 65:67),])
structure(list(version = c("versionB", "versionA", "versionB",
"versionB", "versionA", "versionB", "versionA", "versionB", "versionA",
"versionA"), expertise = c("expert", "expert", "expert", "expert",
"intermediate", "intermediate", "intermediate", "novice", "novice",
"novice"), subjectDictCount = c(12, 53, 52, 33, 38, 27, 23, 40,
23, 24), overlapCount = c(8, 47, 14, 23, 23, 16, 11, 13, 11,
14), relativeOverlap = c(0.666666667, 0.886792453, 0.269230769,
0.696969697, 0.605263158, 0.592592593, 0.47826087, 0.325, 0.47826087,
0.583333333), relativeSize = c(0.184615385, 0.815384615, 0.8,
0.507692308, 0.584615385, 0.415384615, 0.353846154, 0.615384615,
0.353846154, 0.369230769), twoGramsCount = c(11, 52, 51, 32,
37, 26, 22, 39, 22, 23), twoGramsOverlap = c(1, 29, 0, 9, 6,
1, 1, 2, 0, 0)), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))
I found some info that might help in this question: Error in eigen(corr) : infinite or missing values in 'x' when making a 'Correlation matrix circles plot'. However, I know that there are no NA values in my dataset. I also read something about multicollinearity and how it might affect the outcome of this type of analysis. Indeed, there are perfect correlations between subjectDictCount-relativeSize, subjectDictCount-twoGramsCount, and relativeSize-twoGramsCount (because relativeSize and twoGramsCount are built up from subjectDictCount).
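For reference, here is a minimal sketch of how those perfect correlations can be checked on the sample rows from the dput() output. It uses only the three related columns; the names `responses`, `cor_matrix`, `centred`, and `response_rank` are made up for this illustration, and the sample is only 10 of the 67 rows.

```r
# Three of the six responses, copied from the dput() sample in this post.
responses <- data.frame(
  subjectDictCount = c(12, 53, 52, 33, 38, 27, 23, 40, 23, 24),
  relativeSize = c(0.184615385, 0.815384615, 0.8, 0.507692308, 0.584615385,
                   0.415384615, 0.353846154, 0.615384615, 0.353846154,
                   0.369230769),
  twoGramsCount = c(11, 52, 51, 32, 37, 26, 22, 39, 22, 23)
)

# Pairwise correlations: a value of (almost exactly) 1 means one column is
# a linear function of the other.
cor_matrix <- cor(responses)
print(round(cor_matrix, 6))

# Rank of the mean-centred response matrix. A rank below the number of
# columns means the responses carry redundant information, in which case the
# error sum-of-squares-and-products matrix cannot be inverted.
centred <- scale(responses, center = TRUE, scale = FALSE)
response_rank <- qr(centred)$rank
cat("rank:", response_rank, "of", ncol(responses), "columns\n")
```

If the reported rank is lower than the number of columns, at least one of these responses is an exact linear combination of the others.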
Now, what steps do I take?
Do I use a completely different test? Do I test the dependent variables separately from each other? Is it because assumptions have not been met (I thought they had been)?
I thought I was using the right test, but now I am starting to doubt that. It does not help that most explanations of statistics involve a lot of math and strange symbols, which are definitely not my forte.
Any help would be appreciated.
Thank you!
EDIT: I forgot to mention that I do get partial results using the above-mentioned code:
Type II MANOVA Tests:
Sum of squares and products for error:
subjectDictCount overlapCount relativeOverlap relativeSize twoGramsCount twoGramsOverlap
subjectDictCount 6712.34026 3738.44675 -13.4031133 103.2667732 6712.34026 1592.54416
overlapCount 3738.44675 4216.51558 58.7752987 57.5145654 3738.44675 2087.59805
relativeOverlap -13.40311 58.77530 2.3526341 -0.2062017 -13.40311 30.77520
relativeSize 103.26677 57.51457 -0.2062017 1.5887196 103.26677 24.50068
twoGramsCount 6712.34026 3738.44675 -13.4031133 103.2667732 6712.34026 1592.54416
twoGramsOverlap 1592.54416 2087.59805 30.7752046 24.5006793 1592.54416 1429.53149
------------------------------------------
Term: version
Sum of squares and products for the hypothesis:
subjectDictCount overlapCount relativeOverlap relativeSize twoGramsCount twoGramsOverlap
subjectDictCount 9.1727837 17.9271598 0.199953482 0.141119749 9.1727837 16.6558442
overlapCount 17.9271598 35.0365895 0.390786278 0.275802458 17.9271598 32.5519481
relativeOverlap 0.1999535 0.3907863 0.004358698 0.003076207 0.1999535 0.3630734
relativeSize 0.1411197 0.2758025 0.003076207 0.002171073 0.1411197 0.2562438
twoGramsCount 9.1727837 17.9271598 0.199953482 0.141119749 9.1727837 16.6558442
twoGramsOverlap 16.6558442 32.5519481 0.363073427 0.256243756 16.6558442 30.2435065
EDIT: I noticed that when I remove the variable relativeSize from the model, the error does not occur. However, I still do not know why.
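To make that last observation reproducible, here is a rough sketch of a reduced model fitted on the 10 sample rows from the dput() output. Two things in it are assumptions rather than what I actually ran: it drops both derived columns (relativeSize and twoGramsCount), not just relativeSize, and it uses base R's anova() instead of car::Anova() so that it runs without extra packages. The names `sample_rows`, `m2`, and `manova_table` are made up for this example.

```r
# Sample rows from the dput() output, without the two derived columns
# (relativeSize and twoGramsCount).
sample_rows <- data.frame(
  version = c("versionB", "versionA", "versionB", "versionB", "versionA",
              "versionB", "versionA", "versionB", "versionA", "versionA"),
  expertise = c(rep("expert", 4), rep("intermediate", 3), rep("novice", 3)),
  subjectDictCount = c(12, 53, 52, 33, 38, 27, 23, 40, 23, 24),
  overlapCount = c(8, 47, 14, 23, 23, 16, 11, 13, 11, 14),
  relativeOverlap = c(0.666666667, 0.886792453, 0.269230769, 0.696969697,
                      0.605263158, 0.592592593, 0.47826087, 0.325,
                      0.47826087, 0.583333333),
  twoGramsOverlap = c(1, 29, 0, 9, 6, 1, 1, 2, 0, 0)
)

# Multivariate linear model on the four remaining responses.
m2 <- lm(cbind(subjectDictCount, overlapCount, relativeOverlap,
               twoGramsOverlap) ~ version + expertise,
         data = sample_rows)

# Base R's anova() method for multivariate lm fits. Note that it reports
# sequential (Type I) tests using Pillai's trace by default, which is not
# the same as the Type II tests from car::Anova().
manova_table <- anova(m2)
print(manova_table)
```

With only 10 rows the test statistics themselves mean little; the point is only that the fit completes once the redundant responses are left out.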