21

For example:

A <- 1:10
B <- A

Both A and B reference the same underlying vector.

Before I plow off and implement something in C... Is there a function in R which can test whether two variables reference the same underlying object?

Thanks.

Zach
  • Why do you need to know if they refer to the same underlying object from R code? – Tommy Sep 06 '11 at 23:03
  • I'm tallying memory usage, and don't want to count the same allocations more than once. Doing it at the R level probably won't be very accurate, but it should give a reasonable ballpark. – Zach Sep 06 '11 at 23:43
  • The OP has a valid question, and `identical` doesn't cut it. Just because two things are identical doesn't mean that lazy evaluation isn't occurring. One could create a list of 1000 items that consume only marginally more space than 1 item; or create 1000 items separately that do consume 1000x as much space as one item. – Iterator Sep 06 '11 at 23:47

3 Answers

19

You can use the .Internal inspect function:

A <- 1:10
B <- A
.Internal(inspect(A))
# @27c0cc8 13 INTSXP g0c4 [NAM(2)] (len=10, tl=0) 1,2,3,4,5,...
.Internal(inspect(B))  # same
# @27c0cc8 13 INTSXP g0c4 [NAM(2)] (len=10, tl=0) 1,2,3,4,5,...
B[1] <- 21
.Internal(inspect(B))  # different
# @25a7528 14 REALSXP g0c6 [NAM(1)] (len=10, tl=150994944) 21,2,3,4,5,...

Simon Urbanek has written a simple package with similar functionality. It's called... wait for it... inspect. You can get it from RForge.net by running:

install.packages('inspect',repos='http://www.rforge.net/')

UPDATE: A word of warning:

I recommend you use Simon's package because I'm not going to recommend you call .Internal. It certainly isn't intended to be used interactively and it may very well be possible to crash your R session by using it carelessly.
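For a programmatic comparison, the address that inspect prints can be captured as a string. This is a minimal sketch, assuming the first line of output starts with `@` followed by the address, as in the output above. `obj_address` and `same_object` are hypothetical helper names, not part of R, and the same caution about `.Internal` applies:

```r
# Hypothetical helper: return the "@address" token printed by
# .Internal(inspect(x)). Assumes the first output line looks like
# "@27c0cc8 13 INTSXP ...", i.e. the address is followed by a space.
obj_address <- function(x) {
  out <- capture.output(.Internal(inspect(x)))
  sub(" .*$", "", out[1])
}

# Hypothetical helper: TRUE when both arguments report the same address.
same_object <- function(a, b) identical(obj_address(a), obj_address(b))

A <- c(1.5, 2.5, 3.5)
B <- A
same_object(A, B)   # TRUE: B is another name for A's vector
B[1] <- 21
same_object(A, B)   # FALSE: modifying B forced a copy
```

Comparing strings rather than eyeballing the printed output makes the check usable inside a loop, for example when tallying allocations.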

Joshua Ulrich
  • Oooooooh, I just love what shows up for ?.Internal: `Only true R wizards should even consider using this function, and only R developers can add to the list of internal functions.` This sentence is rich. :) (+1 when I can vote in a bit.) – Iterator Sep 06 '11 at 23:58
  • Absolutely Fantastic! Thanks, just what I'm looking for. – Zach Sep 06 '11 at 23:59
  • @Iterator: I like to pretend to be a wizard. I have a hat and everything. I also run with scissors. – Joshua Ulrich Sep 07 '11 at 00:09
  • @Joshua: When you get a chance, can you take a look @ [this question](http://stackoverflow.com/questions/7327521/reference-for-r-wizards). I want to know where the R wizards publish their secrets. :) – Iterator Sep 07 '11 at 00:10
  • @Joshua Ulrich: Magic scissors? – Zach Sep 07 '11 at 00:11
  • Btw, Simon Urbanek mentioned the package is essentially what's in R, in the comments following [this answer](http://stackoverflow.com/a/9168732/805808). – Iterator Feb 07 '12 at 00:41
  • Yes, I have ported my `inspect` implementation to core R, and I'm essentially just maintaining the R version of it. Since `inspect` seems to be still used I may re-map it to use the internal version, but you can safely use `.Internal(inspect(x))`. FWIW there is also `R_inspect` at C level (useful in `gdb`). – Simon Urbanek Feb 07 '12 at 01:27
17

You can get at this via tracemem: if two variables report the same memory address, they refer to the same underlying object.

> a = 1:10
> b = a
> tracemem(a)
[1] "<0x00000000083885e8>"
> tracemem(b)
[1] "<0x00000000083885e8>"
> b = 1:10
> tracemem(b)
[1] "<0x00000000082691d0>"
> 

As for why it is very useful to know if they are the same object: R delays the copy until one of the variables is modified (copy-on-modify). If two variables point to the same object and one of them changes, calculations are suspended while a new block of memory is allocated and the data is copied. In some contexts the delay can be substantial. If the objects are large, the wait is long while a big block of memory is allocated and copied. Other times it could just be death by a thousand cuts: lots of little perturbations and delays here and there.
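The cost described here can be observed with a rough timing comparison. This is only a sketch: the vector size is an arbitrary choice, and the timings depend on the machine and the R version, so no specific numbers are claimed.

```r
x <- numeric(5e6)        # a reasonably large vector (size chosen arbitrarily)
y <- x                   # no copy yet: both names share one allocation

system.time(y[1] <- 1)   # first write to y: pays for allocating and copying
system.time(y[2] <- 2)   # later write: y is no longer shared, so typically no copy

x[1]                     # x is untouched by the writes to y
```

The first assignment is usually the slower of the two, because that is where the shared vector gets duplicated.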

Update (Incorporating Joshua's comment): Be sure to use untracemem(), lest you get lots of output. You can also look at retracemem, though I can't yet comment on its utility for setting the trace.
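Putting the comparison and the cleanup together, a small wrapper could look like the sketch below. `same_address` is a hypothetical name, not an existing R function. It assumes R was built with memory profiling (check `capabilities("profmem")`), and note that it also switches off any tracing that was already active on the objects.

```r
# Hypothetical helper: compare the addresses reported by tracemem(),
# then turn tracing off again so later copies are not reported.
same_address <- function(a, b) {
  addr_a <- tracemem(a)
  addr_b <- tracemem(b)
  untracemem(a)
  untracemem(b)
  identical(addr_a, addr_b)
}

if (capabilities("profmem")) {
  a <- c(1.5, 2.5, 3.5)
  b <- a
  print(same_address(a, b))   # TRUE: both names share one vector
  b[1] <- 0
  print(same_address(a, b))   # FALSE: b was copied when it was modified
}
```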

Iterator
  • Thanks this looks like it would do it too. I didn't realize that you could use tracemem() this way. – Zach Sep 07 '11 at 00:01
  • That's a safer solution, but you would probably want to `untracemem` after your comparison, else you'll be notified every time any of the "traced" objects are altered. – Joshua Ulrich Sep 07 '11 at 00:07
  • Safer, but you won't be a wizaRd, just a muggle. :( – Iterator Sep 07 '11 at 00:12
0

I found this question when looking for a function that checks whether a single variable is referenced at all, especially in the context of data.table. Extending the other answers, I believe the following function does this:

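is_referenced <- function(x) {
  # Name of the variable as written in the call
  nom <- as.character(substitute(x))
  # Every other name bound in the caller's environment
  env <- parent.frame()
  ls_ <- ls(env)
  ls_ <- ls_[ls_ != nom]
  # tracemem() returns the object's address as a string
  tr <- tracemem(x)
  for (i in ls_) {
    # Compare values first; only then compare addresses
    if (identical(x, get(i, envir = env))) {
      if (identical(tr, tracemem(get(i, envir = env)))) {
        # Same address: stop tracing, report the other name
        untracemem(x)
        untracemem(get(i, envir = env))
        print(i)
        return(TRUE)
      } else {
        untracemem(get(i, envir = env))
      }
    }
  }
  untracemem(x)
  FALSE
}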

x <- 1:10
y <- x
is_referenced(x)
#> [1] "y"
#> [1] TRUE

z <- 1:10
is_referenced(z)
#> [1] FALSE

y[1] <- 1L
is_referenced(y)
#> [1] FALSE


library(data.table)
DT <- data.table(x = 1)
ET <- DT
is_referenced(DT)
#> [1] "ET"
#> [1] TRUE
is_referenced(ET)
#> [1] "DT"
#> [1] TRUE

ET[, y := 1]
is_referenced(DT)
#> [1] "ET"
#> [1] TRUE

DT <- copy(ET)
is_referenced(DT)
#> [1] FALSE

Created on 2018-08-07 by the reprex package (v0.2.0).

Hugh