I have run into some very strange memory behavior in R when accessing the slots of a custom S4 class in which one of the slots has row names.
Consider the S4 class below, which holds two identical matrices. The only difference is that I have set row names on one of them with rownames():
#' TestClass
#'
#' @slot matrixWithRowNames A matrix where rownames() is applied.
#' @slot matrixWithNoRowNames A matrix where rownames() has not been applied.
#' @export TestClass
TestClass <- setClass(
  "TestClass",
  slots = list(
    matrixWithRowNames = "matrix",
    matrixWithNoRowNames = "matrix"
  )
)
#' initialize
#' @name init_TestClass
#' @docType methods
#' @export
setMethod("initialize", signature("TestClass"),
  function(.Object) {
    nrow <- 1000
    ncol <- 10000
    # Fill one slot with random data, then copy it into the other slot
    .Object@matrixWithNoRowNames <- matrix(runif(nrow * ncol), nrow)
    .Object@matrixWithRowNames <- .Object@matrixWithNoRowNames
    # Row names are the only difference between the two slots
    rownames(.Object@matrixWithRowNames) <- c(1:nrow)
    return(.Object)
  })
When I measure memory usage, there is a huge difference between the two slots the first time I operate on each of them. In the example below I create an instance of the class and then calculate the column sums of each matrix, using peakRAM to monitor memory consumption.
library(peakRAM)

testObject <- TestClass()
# Slot with row names
peakRAM({
  colSums_1 <- colSums(testObject@matrixWithRowNames)
})
# Slot without row names
peakRAM({
  colSums_2 <- colSums(testObject@matrixWithNoRowNames)
})
According to the peakRAM output, operating on the matrix with row names uses a peak of 76.4 MB of RAM, which equals the size of the matrix itself. Operating on the matrix without row names uses almost no RAM, as expected.
This memory consumption only happens the first time I access a slot with row names.
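As a cross-check that needs neither peakRAM nor the S4 class, the same comparison can be sketched in base R on plain matrices. This is only a rough sketch: `peak_mb` is a hypothetical helper (not part of any package) that reads the "max used" figures reported by gc(), and `n_row`, `n_col`, `plain` and `named` are illustrative names. It assumes that gc()'s statistics are precise enough to reveal an allocation of this size.

```r
# Hypothetical helper: rough peak memory (in Mb) allocated while evaluating
# `expr`, measured above the usage at the time of the call.
peak_mb <- function(expr) {
  g0 <- gc(reset = TRUE)            # collect garbage and reset "max used"
  baseline <- sum(g0[, 2])          # currently used memory in Mb (Ncells + Vcells)
  force(expr)                       # evaluate the lazily passed expression
  g1 <- gc()
  sum(g1[, ncol(g1)]) - baseline    # last column of gc() is "max used" in Mb
}

n_row <- 1000
n_col <- 10000

# Same construction as in the initialize method, but without the S4 class
plain <- matrix(runif(n_row * n_col), n_row)
named <- plain
rownames(named) <- c(1:n_row)

first_named  <- peak_mb(colSums(named))   # first access, with row names
second_named <- peak_mb(colSums(named))   # second access, with row names
first_plain  <- peak_mb(colSums(plain))   # first access, without row names

print(c(first_named = first_named,
        second_named = second_named,
        first_plain = first_plain))
```

If the first value is large here as well, the behavior would not be specific to S4 slots; if it is not, the class itself is more likely to be involved.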
Does anybody know why R behaves this way? And is there a way to use row names without this issue? In my actual code I have some very large objects, and this behavior can cause the code to crash.