Why are object sizes different in R?

Question

I ran the following in RStudio, and the result surprised me:

rm(list = ls())
require(lobstr)
x <- 1:3
tracemem(x)
y <- x
x[1] <- 4L
obj_size(x)
obj_size(y)

Why are they different?

Tangent: you're using `require` wrong, either use `library` or check require's return value, ref: https://stackoverflow.com/a/51263513/3358272. — r2evans, Dec 26 '20 at 17:17

user2554330 · Answer 1 · 2020-12-26T20:01:23.303

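R uses a special compact storage format for integer sequences such as 1:3 (part of the ALTREP framework, available since R 3.5.0): it describes the sequence rather than storing every element. When you modify an element, as with x[1] <- 4L, x drops back to standard storage, while y keeps the compact form.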
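One way to see the two representations is to inspect the objects directly. The sketch below assumes R 3.5.0 or later and relies on `.Internal(inspect())`, an undocumented debugging helper whose output format is not guaranteed; the variable names `repr_x`, `repr_y`, `is_compact_x` and `is_compact_y` are only illustrative.

```r
# Sketch only: .Internal(inspect()) is an internal debugging aid, and the
# text it prints may differ between R versions.
x <- 1:3
y <- x        # x and y now refer to the same compact sequence
x[1] <- 4L    # modifying x forces a copy held in standard storage

# Capture the printed internal description of each object.
repr_y <- capture.output(.Internal(inspect(y)))
repr_x <- capture.output(.Internal(inspect(x)))

cat(repr_y, sep = "\n")
cat(repr_x, sep = "\n")

# Assumption: compact sequences are labelled "compact" in the output.
is_compact_y <- any(grepl("compact", repr_y, fixed = TRUE))
is_compact_x <- any(grepl("compact", repr_x, fixed = TRUE))
```

If the assumption about the label holds, `is_compact_y` should be TRUE and `is_compact_x` FALSE, matching the difference in reported sizes.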

For a short sequence like 1:3 the compact form is actually larger than standard storage, but its size would be the same for 1:3000000:

rm(list = ls())
library(lobstr)
x <- 1:3
y <- x
x[1] <- 4L
obj_size(x)
#> 64 B
obj_size(y)
#> 680 B

x <- 1:3000000
y <- x
x[1] <- 4L
obj_size(x)
#> 12,000,048 B
obj_size(y)
#> 680 B

Created on 2020-12-26 by the reprex package (v0.3.0)

It's also quite hard to define the size of objects in R. For example, a sequence of length 3 (or 3 million) takes up 680 bytes, but two of them don't take up twice that:

x <- 1:3
obj_size(x)
#> 680 B
y <- 1:3000000
obj_size(y)
#> 680 B
z <- list(x, y)
obj_size(z)
#> 896 B

Created on 2020-12-26 by the reprex package (v0.3.0)

The size of z includes the size of the list() container as well as the two objects bound to x and y, but it's still only 216 bytes bigger than either of them. This is because some of the size attributed to x and y is shared: they're both the same kind of special object, so the code to handle that is only stored once.
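The shared portion can be estimated without the list wrapper, since `obj_size()` accepts several objects and counts anything they share only once. This is a rough sketch assuming lobstr is installed; the names `separate`, `combined` and `shared` are illustrative, and the exact byte counts depend on the R version and platform.

```r
library(lobstr)  # assumed to be installed

x <- 1:3
y <- 1:3000000

# Sum of the sizes measured one object at a time: shared parts are
# counted once per object, so they are counted twice here.
separate <- as.numeric(obj_size(x)) + as.numeric(obj_size(y))

# Size of both objects measured together: shared parts are counted once.
combined <- as.numeric(obj_size(x, y))

# The difference approximates the memory shared between x and y.
shared <- separate - combined
print(c(separate = separate, combined = combined, shared = shared))
```

A positive `shared` value is the part of each reported size that belongs to the common machinery rather than to the individual sequence.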
