I ran the following in RStudio, and the result surprised me:

rm(list = ls())
require(lobstr)
x <- 1:3
tracemem(x)
y <- x
x[1] <- 4L
obj_size(x)
obj_size(y)

Why are the two sizes different?

sourab maity
tahzibi
  • Tangent: you're using `require` wrong, either use `library` or check require's return value, ref: https://stackoverflow.com/a/51263513/3358272. – r2evans Dec 26 '20 at 17:17

1 Answer

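R (since version 3.5.0) uses a special compact storage, part of the ALTREP framework, for integer sequences created with `:`. Such a sequence is stored as a short description (its start and length) rather than element by element. When you change the first entry, the vector can no longer be described that way, so R drops back to the standard storage, where every element is held in memory.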

For a short sequence like 1:3 the special storage is actually larger than the standard one (680 B versus 64 B), but its size stays the same however long the sequence is, for example for 1:3000000:

rm(list = ls())
library(lobstr)
x <- 1:3
y <- x
x[1] <- 4L
obj_size(x)
#> 64 B
obj_size(y)
#> 680 B

x <- 1:3000000
y <- x
x[1] <- 4L
obj_size(x)
#> 12,000,048 B
obj_size(y)
#> 680 B

Created on 2020-12-26 by the reprex package (v0.3.0)
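
The same difference can be seen without modifying anything. The sketch below compares a sequence built with `:` against the same values spelled out with `c()`. It assumes lobstr is installed and that R is version 3.5.0 or later; the names `compact` and `regular` are only illustrative.

```r
library(lobstr)

# Created with `:`, so R (>= 3.5.0) can keep it in the compact form
compact <- 1:3

# Same values spelled out with c(), which uses the standard storage
regular <- c(1L, 2L, 3L)

# Print both sizes for comparison
obj_size(compact)
obj_size(regular)
```

With the same setup as the reprex above, the two sizes should correspond to the 680 B and 64 B figures, though exact values may vary between R and lobstr versions.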

It's also quite hard to define the size of objects in R. For example, a sequence of length 3 (or 3 million) takes up 680 bytes, but two of them don't take up twice that:

x <- 1:3
obj_size(x)
#> 680 B
y <- 1:3000000
obj_size(y)
#> 680 B
z <- list(x, y)
obj_size(z)
#> 896 B

Created on 2020-12-26 by the reprex package (v0.3.0)

The size of z includes the list() container as well as the two objects bound to x and y, yet z is only 216 bytes bigger than either of them. This is because part of the size attributed to x and y is shared: both are the same kind of special object, so the code that handles it is stored only once.
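
The sharing can also be checked without wrapping the objects in a list. As a minimal sketch, assuming lobstr is installed: `obj_size()` accepts several objects in one call and counts shared components once, so the combined size can be compared with the sum of the individual sizes. The names `separate` and `combined` are only illustrative.

```r
library(lobstr)

x <- 1:3
y <- 1:3000000

# Measuring each object on its own counts the shared part twice
separate <- as.numeric(obj_size(x)) + as.numeric(obj_size(y))

# Measuring both in a single call counts shared components only once
combined <- as.numeric(obj_size(x, y))

separate
combined
```

If the two sequences share part of their storage as described, `combined` comes out smaller than `separate`.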

user2554330