I know there is a lot of information on Google about this problem, but I could not solve it. I have a data frame:

> str(myData)
'data.frame':   1199456 obs. of  7 variables:  
$ A: num  3064 82307 4431998 1354 193871 ...  
$ B: num  6067 403916 2709997 2743 203434 ...  
$ C: num  299 11752 33282 170 2748 ...  
$ D: num  105 6676 7065 20 1593 ...  
$ E: num  8 572 236 3 170 ...  
$ F: num  0 21 95 0 13 ...  
$ G: num  583 18512 961328 348 42728 ...

Then I convert it to a matrix in order to apply the Cramer-von Mises test from the "cramer" package:

> myData = as.matrix(myData)
> str(myData)
 num [1:1199456, 1:7] 3064 82307 4431998 1354 193871 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:1199456] "8" "32" "48" "49" ...
  ..$ : chr [1:7] "A" "B" "C" "D" ...

After that, when I call "cramer.test(myData[x1:y1,], myData[x2:y2,])", I get the following error:

Error in rep(0, (RVAL$m + RVAL$n)^2) : invalid 'times' argument
In addition: Warning message:
In matrix(rep(0, (RVAL$m + RVAL$n)^2), ncol = (RVAL$m + RVAL$n)) :
NAs introduced by coercion

I also tried to convert the data frame to a matrix like this, but the error is the same:

> myData = as.matrix(sapply(myData, as.numeric))
> str(myData)
 num [1:1199456, 1:7] 3064 82307 4431998 1354 193871 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:7] "A" "B" "C" "D" ...
Ben Bolker
ibci

1 Answer

Your problem is that your data set is too large for the algorithm that cramer.test is using (at least the way it's coded). The code tries to create a lookup table according to

lookup <- matrix(rep(0, (RVAL$m + RVAL$n)^2), 
     ncol = (RVAL$m + RVAL$n))

where RVAL$m and RVAL$n are the numbers of rows of the two samples. The standard maximum length of an R vector is 2^31-1 on a 32-bit platform: since your samples have equal numbers of rows N, you'll be trying to create a vector of length (2*N)^2, which in your case is 5.754779e+12 -- probably too big even if R would let you create the vector. That length is also outside the integer range, which is presumably why rep() complains about an invalid 'times' argument after the "NAs introduced by coercion" warning.
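
To make the size argument concrete, here is a rough back-of-the-envelope check in R. It only reproduces the arithmetic of the `(RVAL$m + RVAL$n)^2` expression shown above. The variable names are mine, and it assumes both samples use all 1199456 rows and that each entry of the lookup table is an 8-byte double.

```r
# Number of rows in each sample (taken from the str() output in the question)
n_rows <- 1199456

# cramer.test asks for a square lookup table with (m + n) rows and columns;
# the length is computed here in double precision, so it does not overflow
lookup_len <- (n_rows + n_rows)^2

# Largest value that fits in an R integer (2^31 - 1)
max_int <- .Machine$integer.max

# TRUE: the requested length is far beyond the integer range
exceeds_limit <- lookup_len > max_int

# Approximate memory for the table at 8 bytes per double, in terabytes
# (assumption: plain numeric storage, no sparsity)
size_tb <- lookup_len * 8 / 1e12

# Largest combined sample size m + n whose square still fits in an integer
max_total_rows <- floor(sqrt(max_int))

print(lookup_len)      # about 5.754779e+12
print(size_tb)         # about 46 terabytes
print(max_total_rows)  # 46340
```

So even on a build of R that accepts longer vectors, the table would need tens of terabytes of memory. By this arithmetic alone, the combined number of rows passed to the test would have to stay below roughly 46,000.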

You may have to look for another implementation of the test, or another test.
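
If neither option is available, one possible workaround (my suggestion, not something the cramer package documents) is to run the test on random row subsamples small enough for the lookup table to fit. This is only a sketch: `subsample_rows` is a hypothetical helper, the data below are simulated stand-ins for `myData`, and subsampling reduces the power of the test, so treat any result as approximate.

```r
# Hypothetical helper (not part of the cramer package): draw a random
# subset of rows so that each sample stays within a chosen size
subsample_rows <- function(x, size) {
  size <- min(size, nrow(x))
  x[sample(nrow(x), size), , drop = FALSE]
}

set.seed(1)  # make the illustration reproducible

# Simulated stand-in with the same 7-column layout as in the question
myData <- matrix(rnorm(7 * 10000), ncol = 7,
                 dimnames = list(NULL, LETTERS[1:7]))

# Take 200 rows from each of the two row ranges being compared
sample1 <- subsample_rows(myData[1:5000, ], 200)
sample2 <- subsample_rows(myData[5001:10000, ], 200)

# The lookup table now has (200 + 200)^2 = 160000 entries
lookup_len <- (nrow(sample1) + nrow(sample2))^2

# Run the test only if the cramer package is installed
if (requireNamespace("cramer", quietly = TRUE)) {
  library(cramer)
  result <- cramer.test(sample1, sample2)
  print(result)
}
```

Repeating this over several independent subsamples gives a feel for how stable the conclusion is, but it is not equivalent to running the test on the full data.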

Community
Ben Bolker