TL;DR: Welcome to shared environments and Rcpp's use of pointers.
There are two ways to address R's treatment of objects in a shared environment to avoid the domino update associated with the function.
- perform a deep copy of the object and then update the matrix list position or
- have only unique elements inside the list (e.g. no duplicated matrices)
To begin, let's look at the memory allocation for this shared memory list. For convenience -- and to avoid tracemem()
-- we'll use the lobstr
package to explore the shared environment. The package is currently GitHub-only and can be obtained using:
install.packages("devtools")
devtools::install_github("r-lib/lobstr")
With lobstr
in hand, let's take a look at the underlying memory addresses of the matrix list in R...
x = matrix(list(matrix(0,3,2)),2,2)
lobstr::obj_addrs(x)
# [1] "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
# ^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^
# identical memory addresses for all objects
Notice that all objects share the same memory location. Thus, R is treating each of these elements inside the list as being the same. R opts for this behavior to reduce the size of data in memory as each element belongs to the same shared environment.
This is particularly problematic for Rcpp as it is manipulating pointers, e.g. a location in memory that shows where a value is stored, and not values, e.g. a variable that holds a value like an int
or a double
.
Due to this pointer behavior, two actions occur:
- all of the matrices in the matrix list are updated simultaneously, and
- the overall object
x
is updated without requiring a return statement.
If we modify your function slightly, the second point is more apparent.
Modified Function to Illustrate Pointers
Note: This function no longer has a return type, yet we are still seeing the x
object change in R's environment.
#include<Rcpp.h>
// [[Rcpp::export]]
void ListMatrixType_pointer(Rcpp::ListMatrix x){
Rcpp::NumericMatrix a = x(0, 0);
a(0, 0) = 100;
}
Output
ListMatrixType_pointer(x)
str(x)
# List of 4
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# - attr(*, "dim")= int [1:2] 2 2
Notice, we do not have to return a value as x
is automatically updated. Furthermore, we still have the same memory location for each of the elements, e.g.
lobstr::obj_addrs(x)
# [1] "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
# ^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^
# identical memory addresses for all objects
Deep Copies of Matrix
To get around the identical memory addresses, you can deeply clone the object and then save it back into the ListMatrix
. The clone()
function instantiates a new block of memory to store the values to. Thus, modifying one element no longer will trigger a domino update. Furthermore, when you add a cloned object back to a list, the memory address changes for only that element.
Using clone()
to make a deep copy
#include<Rcpp.h>
// [[Rcpp::export]]
void ListMatrixType_clone(Rcpp::ListMatrix x){
Rcpp::NumericMatrix a = x(0, 0);
// Perform a deep copy of a into b
// and, thus, changing the memory address
Rcpp::NumericMatrix b = Rcpp::clone(a);
b(0, 0) = 100;
// Update x with b
x(0, 0) = b;
}
Output
ListMatrixType_clone(x)
str(x)
# List of 4
# $ : num [1:3, 1:2] 100 0 0 0 0 0
# $ : num [1:3, 1:2] 0 0 0 0 0 0
# $ : num [1:3, 1:2] 0 0 0 0 0 0
# $ : num [1:3, 1:2] 0 0 0 0 0 0
# - attr(*, "dim")= int [1:2] 2 2
lobstr::obj_addrs(x)
# [1] "0x7fceb811a7e8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
# ^^^^^^^ ^^^^^^^
# different memory addresses
Unique Elements in a List
To emphasize the need for unique elements, consider modifying the matrix in the first position of the matrix list, the memory address for only the first element will change; the rest of the addresses will remain the same. e.g.
x[[1, 1]] = matrix(1, 2, 2)
lobstr::obj_addrs(x)
# [1] "0x7fceb811a7e8" "0x7fceb69757c8" "0x7fceb69757c8" "0x7fceb69757c8"
# ^^^^^^^ ^^^^^^^
# different memory addresses
Additional Resources
For further reading on this topic please see any of the linked posts below that emphasize pointer behavior associated with Rcpp (Disclaimer: I wrote them):
Miscellaneous note on Object Sizes
Note: object.size()
from utils
in Base R provides an incorrect calculation of the shared environment sizes. c.f. ?utils::object.size
This function merely provides a rough indication: it should be reasonably accurate for atomic vectors, but does not detect if elements of a list are shared, for example. (Sharing amongst elements of a character vector is taken into account, but not that between character vectors in a single object.)
lobstr::obj_size(x)
# 424 B
utils::object.size(x)
# 1304 bytes
So, due to the shared environment, the object size of the list is about 1/3.