
I am trying to reproduce the example in Advanced R that shows how many names point to a location. As the author says:

Note that if you’re using RStudio, refs() will always return 2: the environment browser makes a reference to every object you create on the command line.

However, in my case refs() always returns 65535, even after I have cleared the global environment:

library(pryr)
x <- 1:10
c(address(x), refs(x))

## "0x1d931f32d68" "65535" 

What does this mean?

user20650
YoRHa
    16 bit integer: 65535 is -1 so maybe it indicates an error – Marichyasana Dec 04 '21 at 02:17
    @Ben Bolker, did you check the [ref](https://www.tidyverse.org/blog/2018/12/lobstr/) function from `lobstr`? – Quinten Aug 23 '22 at 14:21
    Yes; it doesn't seem to do the same thing as `pryr::refs()`, at least as far as I could tell from a quick glance/experiment. (I am very curious why `refs()` appears to be broken/wrong now -- changes in R internals? -- but it was too much work to dig in and try to figure it out, which is part of why I offered the bounty ...) – Ben Bolker Aug 23 '22 at 14:26

1 Answer


refs() uses the named field of a SEXP to determine the number of references. This is not very precise to begin with, because the field could originally only take the values 0, 1 or 2. The NAMED mechanism for counting references can also lead to confusion.

At some point, the SEXP header was changed so that the named field can take more bits. If I understand the documentation correctly, this happened in R 3.5.0. Currently, the default for NAMED_BITS is 16, which gives a maximum of 65535 for the named field.
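
The arithmetic behind that maximum can be checked directly in R. This is only a sketch of the bit-width calculation, assuming the 16-bit field width mentioned above; `named_bits` is a variable name made up for illustration and is not something R exposes.

```r
# Assumed field width, taken from the NAMED_BITS value quoted above.
named_bits <- 16

# Largest unsigned value that fits in that many bits.
max_count <- 2^named_bits - 1
max_count

# -1 truncated to 16 bits has every bit set, i.e. the same bit pattern
# as 65535, which is why the value can look like an error code.
bitwAnd(-1L, 0xFFFFL)
```

Both expressions should give 65535, so the number is consistent with "the field is at its maximum" and does not by itself indicate an error.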

Another change in R 3.5.0 was the ALTREP project, which allows a more efficient representation of basic R objects. One of these representations is the compact integer vector: integer vectors without gaps (e.g. 1, 2, 3, 4 but not 1, 3, 4) are described only by their start and end values instead of storing every element. However, these "compact" representations are marked as not mutable. This is achieved by setting the count to NAMEDMAX or REFCNTMAX, which is 65535.

This is only the case if you create the vector with : or with seq (when it dispatches to seq.int):

library(pryr)
x <- 1:10
.Internal(inspect(x))
# @0x00000000156ecce8 13 INTSXP g0c0 [REF(65535)]  1 : 10 (compact)
refs(x)
# [1] 65535

and

y <- seq(1, 10)
.Internal(inspect(y))
# @0x000000000521e218 13 INTSXP g0c0 [REF(65535)]  1 : 10 (compact)
refs(y)
# [1] 65535

Here you can see the reference count (shown as the REF field, because R 4.0.0 switched from the NAMED mechanism to reference counting) and that the vector is stored in the compact form.

When the vector is initialised the naive way, the normal representation is used and the reference count is 1:

z <- c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L)
.Internal(inspect(z))
# @0x00000000057c7808 13 INTSXP g0c4 [REF(1)] (len=10, tl=0) 1,2,3,4,5,...
refs(z)
# [1] 1

Tested with R 4.1.3 in a console, not in RStudio.
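
The difference between the two representations can also be checked without pryr. The following is a minimal sketch, assuming R 3.5.0 or later and relying on the undocumented `.Internal(inspect())` output containing the text "(compact)" as in the listings above. `is_compact()` is a helper name invented for this example, and the inspect output format is an internal detail that may change between R versions.

```r
# Hypothetical helper: TRUE if inspect() reports the object as a compact
# (ALTREP) sequence. Depends on the internal inspect() output format.
is_compact <- function(obj) {
  out <- capture.output(.Internal(inspect(obj)))
  any(grepl("(compact)", out, fixed = TRUE))
}

x <- 1:10                                        # built with `:`
z <- c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L)  # built element by element

is_compact(x)
is_compact(z)

# The values are the same; only the internal representation differs.
identical(x, z)
```

If the explanation above holds, the first call should report TRUE and the second FALSE, while identical() treats the two vectors as equal. That would suggest the 65535 is a property of how the vector was created, not of how many names point to it.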

starja