Regarding this answer in "What exactly is copy-on-modify semantics in R, and where is the canonical source?":
We can see that the first time a vector is altered with '[<-', R copies the entire vector even if only a single entry is to be modified. The second time, however, the vector is altered "in place". This is noticeable without inspecting the address of the objects if we measure the time to create and modify a large vector:
> system.time(a <- rep(1L, 10^8))
user system elapsed
0.15 0.17 0.31
> system.time(a[222L] <- 111L)
user system elapsed
0.26 0.08 0.34
> system.time(a[333L] <- 111L)
user system elapsed
0 0 0
Note that there is no change of type/storage.mode.
So the question is: why is it not possible to optimize the first bracket assignment as well? In what situation is this kind of behaviour (a full copy at the first modification) actually needed?
EDIT: (spoiler!) As explained in the accepted answer below, this is nothing but an artifact of enclosing the first assignment in a system.time function call. This causes R to mark the memory bound to the symbol a as possibly being referred to by more than one symbol, thus requiring duplication when it is changed. If we remove the enclosing calls, the vector is modified in place from the very first bracket assignment.
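A minimal sketch of how one might check this directly, rather than inferring it from timings. It assumes R was built with memory profiling, so that tracemem() is available; the names n and can_trace are only for illustration, and the vector is smaller than the one above to keep it quick. While tracing is on, R prints a tracemem line whenever a is duplicated. Whether a copy is reported may depend on the R version and on the front end in use.

```r
n <- 1e7                     # illustrative size, smaller than the 10^8 used above
a <- rep(1L, n)              # plain top-level assignment, no system.time() wrapper

# tracemem() only works if R was compiled with memory profiling support
can_trace <- capabilities("profmem")[[1]]
if (can_trace) invisible(tracemem(a))   # report any duplication of a from here on

a[222L] <- 111L              # first bracket assignment
a[333L] <- 111L              # second bracket assignment

if (can_trace) untracemem(a) # stop tracing

# The values are updated either way; only the copying behaviour is in question
print(a[c(222L, 333L)])
```

Wrapping the creation step in system.time(), as in the transcript above, and rerunning the sketch is one way to compare the two situations on a given R installation.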
Thanks, Martin, for the in-depth solution!