4

I would like to know if there is a 'proper' way to subset big.matrix objects in R. It is simple to subset one with `[`, but the class of the result always reverts to 'matrix'. This isn't a problem when working with small datasets like this one, but with extremely large datasets the subset could still benefit from the 'big.matrix' class.

require(bigmemory)
data(iris)
# I realize the warning about factors but not important for this example
big <- as.big.matrix(iris)

class(big)
[1] "big.matrix"
attr(,"package")
[1] "bigmemory"

class(big[,c("Sepal.Length", "Sepal.Width")])
[1] "matrix"

class(big[,1:2])
[1] "matrix"
cdeterman

2 Answers

5

I have since learned that the 'proper' way to subset a big.matrix is to use sub.big.matrix, although this works only for contiguous ranges of columns and/or rows. Non-contiguous subsetting is not currently implemented.

sm <- sub.big.matrix(big, firstCol=1, lastCol=2)
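A slightly fuller sketch of the same idea, assuming the bigmemory package is installed. It uses only the numeric columns of iris to sidestep the factor warning, and the row range is arbitrary. As far as I understand, the result is a view onto the parent rather than a copy, so writes through it show up in the original:

```r
library(bigmemory)

# Numeric-only data avoids the factor coercion warning from iris
big <- as.big.matrix(as.matrix(iris[, 1:4]))

# A view onto rows 1-10 and columns 1-2 of the parent
sm <- sub.big.matrix(big, firstRow = 1, lastRow = 10,
                     firstCol = 1, lastCol = 2)

class(sm)              # the class is kept, unlike with `[`
is.sub.big.matrix(sm)  # reports whether sm is a sub-matrix view
dim(sm)

# The view shares memory with the parent, so this also changes big[1, 1]
sm[1, 1] <- -1
big[1, 1]
```

Because the view shares storage with `big`, copy it first if the subset needs to be modified independently.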
cdeterman
-1

It doesn't seem to be possible without calling as.big.matrix on the subset.

From the big.matrix documentation,

If x is a big.matrix, then x[1:5,] is returned as an R matrix containing the first five rows of x.

I presume this applies to columns as well, so it seems you would need to call

a <- as.big.matrix(big[,1:2])

in order for the subset to also be a big.matrix object.

class(a)
# [1] "big.matrix"
# attr(,"package")
# [1] "bigmemory"
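Note that `big[, 1:2]` first materializes an ordinary R matrix in memory before it is converted back. A possible alternative, sketched here on the assumption that bigmemory's `deepcopy` accepts `cols` and `rows` arguments as I recall from its documentation, is to copy the selection straight into a new big.matrix. The column indices are arbitrary and deliberately non-contiguous:

```r
library(bigmemory)

# Numeric-only data avoids the factor coercion warning from iris
big <- as.big.matrix(as.matrix(iris[, 1:4]))

# Copy columns 1 and 3 into a new, independent big.matrix
b <- deepcopy(big, cols = c(1, 3))

class(b)
dim(b)

# Unlike a sub.big.matrix view, this is a copy: changing b leaves big alone
b[1, 1] <- -1
big[1, 1]
```

This trades extra storage for an independent object, so it suits cases where the subset is not a contiguous block or must not alias the original.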
Rich Scriven