I stumbled across this behavior:

> x <- 1:5
> tracemem(x)
[1] "<0x12145b7a8>"
> "names<-"(x, letters[1:5])
a b c d e 
1 2 3 4 5 
> x
a b c d e 
1 2 3 4 5  
> y <- 1L
> tracemem(y)
[1] "<0x12587ed68>"
> "names<-"(y,letters[1])
tracemem[0x12587ed68 -> 0x12587efa8]: 
a 
1
> y
[1] 1 

when trying to help someone figure out why in the former case the vector's names are being modified but in the latter they are not.

Clearly, the length-one vector is being copied, while the length-five vector is being modified in place:

> x <- 1:5
> y <- 1L
> .Internal(inspect(x))
@121467490 13 INTSXP g0c3 [MARK,NAM(1)] (len=5, tl=0) 1,2,3,4,5
> .Internal(inspect(y))
@1258d74d8 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1

Why does the length one vector start out its existence with its NAMED property incremented to 2?

In response to @nograpes's comment below, I'm seeing this on OS X 10.7.5 and R 3.0.2.

joran
  • Well, other than a possible assumption on the part of the writer that scalars would never get a name assigned... – Carl Witthoft Feb 25 '14 at 18:34
  • In Windows, I cannot duplicate this: the vector `x`'s names are *not* modified in place. – nograpes Feb 25 '14 at 18:35
  • @nograpes The plot thickens. I added my OS and R version to the question. I find it odd that there would be differences between OS for this... – joran Feb 25 '14 at 18:37
  • BTW, since the code `"names<-"(something)` is wickedly twisted at best, why are you doing this? `names(y) <- letters[1]` works just fine. -- I can reproduce your results for `x` and `y` under Win7, R 3.0.2. – Carl Witthoft Feb 25 '14 at 18:40
  • Okay, this is weird too: In Windows 7 x64, under RStudio 0.98.501, and R 3.0.2, the names are *not* modified in place, but under plain R, they *are* modified in place. – nograpes Feb 25 '14 at 18:41
  • @CarlWitthoft That was simply the context in which it came up and where I noticed the difference. Either way, the question is why the difference in the NAMED property, which is the source of the copying... – joran Feb 25 '14 at 18:41
  • Maybe relevant: if you initialize `y` to `1:1` or `c(1L)`, you get the same behavior as you do for `x`, even though `identical(1L, 1:1)` and `identical(1L, c(1L))` are both `TRUE`. – flodel Feb 25 '14 at 18:49
  • @CarlWitthoft It's not wickedly twisted, just calling the replacement function directly to help to isolate the operation. – Gavin Simpson Feb 25 '14 at 18:53
  • @flodel, looks like there are identical identicals and unidentical identicals (with no apologies to Rumsfeld). – Carl Witthoft Feb 25 '14 at 19:28
  • So am I the only one for whom: `rm(x); x <- 1:5; "names<-"(x, letters[1:5]); x` produces: `[1] 1 2 3 4 5`? This is on windows 7 64 "R version 3.0.2 (2013-09-25)". – BrodieG Feb 25 '14 at 20:22
  • @BrodieG It sounded like nograpes might have seen the same thing on that platform. See if it changes between the R GUI, RStudio, command line, etc. – joran Feb 25 '14 at 20:25
  • I missed @nograpes comment. I can confirm what he saw; it works weird under RStudio, but okay in command line. – BrodieG Feb 25 '14 at 20:31
  • And how on earth does RStudio have any impact on this? – BrodieG Feb 25 '14 at 20:43
  • @BrodieG [Here's how](http://stackoverflow.com/questions/15559387/operator-in-rstudio-and-r/15559956#15559956) ;) – Josh O'Brien Feb 25 '14 at 21:31
  • @JoshO'Brien, fascinating – BrodieG Feb 25 '14 at 22:21

1 Answer

Matthew Dowle asked the same question here, and Peter Dalgaard answered thusly:

This is tricky business... I'm not quite sure I'll get it right, but let's try

When you are assigning a constant, the value you assign is already part of the assignment expression, so if you want to modify it, you must duplicate. So `NAMED==2` on `z <- 1` is basically to prevent you from accidentally "changing the value of 1". If it weren't, then you could get bitten by code like `for(i in 1:2) {z <- 1; if(i==1) z[1] <- 2}`.
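To make the loop concrete, here is a minimal sketch that runs it and records `z` on each pass. The vector `seen` is a hypothetical helper added only for illustration; it is not part of Peter's example.

```r
# Record the value of z at the end of each iteration.
seen <- numeric(2)  # hypothetical helper, not in the original post

for (i in 1:2) {
  z <- 1                 # binds z to the constant stored in the loop body
  if (i == 1) z[1] <- 2  # must not alter that stored constant
  seen[i] <- z
}

print(seen)
```

Because the constant is protected, the second pass starts again from `z == 1`. If the assignment in the first pass had modified the constant in place, the second pass would have started from 2.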

This may seem exotic, but really, the rationale is exactly the same as it is for incrementing NAM to 2 whenever doing an assignment of the form x <- y.

As discussed here, R supports a "call by value" illusion to avoid at least some unnecessary copying of objects. So, for instance, `x <- y` really just binds the symbol `x` to `y`'s value. The danger of doing that without further precautions, though, is that subsequent modification of `x` would also modify `y` and any other symbols linked to `y`. R gets around this by marking `y`'s value as "linked to" (by setting its NAM=2) as soon as it is assigned (or even potentially assigned) to another symbol.
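The visible effect of that bookkeeping is easy to check. A minimal sketch, with values chosen arbitrarily for illustration:

```r
y <- c(10, 20, 30)
x <- y        # x and y may share one underlying vector at this point
x[1] <- 99    # the modification applies to x only; y keeps its values

print(x)
print(y)
```

Whatever sharing happens internally, modifying `x` never shows through in `y`, which is the "call by value" illusion described above.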

When you do `x <- 1`, the `1` is more or less just another `y` whose value is being linked to the symbol `x` by the assignment expression. It's just that the potential for mischief arising from subsequent modification of `x`'s value (recalling that at this point, it's just a reference to the value of `1`!) is awful to imagine. But, as always with assignments of one symbol to another, R sets NAM=2, and no modifications without actual copying are allowed.

The reason `x <- 1:10` is different (as are `x <- 1:1`, `x <- c(1)`, `x <- seq(1)`, and even `x <- -1`) is that the RHS is actually a function call, and the result of that function call is what's being assigned to `x`. In these cases, the value of `x` is not just a reference to the value of some other symbol; modifying `x` won't potentially change the value of some other symbol, so there is no need to set NAM=2.
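If you want to compare the two cases yourself, a sketch along these lines prints the internal headers side by side. The names `x_const` and `x_call` are hypothetical. No expected output is shown because it depends on the R version: R 3.x reports a `NAM(...)` field as in the question, while R 4.0.0 and later use reference counting and report a different field.

```r
x_const <- 1L     # RHS is a constant held in the parsed expression
x_call  <- c(1L)  # RHS is a function call whose result is assigned

# The two values are indistinguishable at the R level ...
stopifnot(identical(x_const, x_call))

# ... but their internal headers may differ; compare the bracketed flags.
.Internal(inspect(x_const))
.Internal(inspect(x_call))
```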

Josh O'Brien
  • I think I might just barely follow this. Would you be able to elaborate on exactly what he's implying might go wrong in that for loop...? I think I get it, but it seems almost too crazy to be true. – joran Feb 25 '14 at 19:18
  • Is Peter suggesting that without this, `z[1] <- 2` could actually change the value of `1` to `2`? Because `1` is intimately linked with `z` due to the first assignment in the loop? – Gavin Simpson Feb 25 '14 at 19:31
  • @GavinSimpson Yep. I'll work up an edit to my answer that aims to clarify that and to answer joran's query. – Josh O'Brien Feb 25 '14 at 19:34
  • @JoshO'Brien Thanks; That was how I recalled understanding that thread when it originally took place, but seeing it here like that made my head hurt again with the potential implications if it were so. – Gavin Simpson Feb 25 '14 at 19:40