x <- 1:10
str(x)
# int [1:10] 1 2 3 4 5 6 7 8 9 10
str(as.double(x))
# num [1:10] 1 2 3 4 5 6 7 8 9 10 
str(as(x, 'double'))
# int [1:10] 1 2 3 4 5 6 7 8 9 10

I'd be surprised if there were a bug in R with something as basic as type conversion. Is there a reason for this inconsistency?

Matthew Plourde
  • 3
    Related: `class(x); class(x) <- "numeric"; class(x)`. – joran Dec 04 '15 at 17:08
  • @joran This doesn't surprise me, since both doubles and integers return `TRUE` from `is.numeric`. Note that `class(x) <- 'double'` works. – Matthew Plourde Dec 04 '15 at 17:10
  • The weirdest bit of this (and the part that makes me think it *is* a bug) is that there actually is a `coerce()` method defined for going from `"integer"` to numeric, and it has no effect. See `methods("coerce"); getMethod("coerce", c(from="integer", to="numeric"))`. (The body of the `coerce()` method uses the same ineffectual `class(x) <- "numeric"` assignment that @joran demos above.) If it instead used `storage.mode(x) <- "numeric"`, it would have the expected effect... – Josh O'Brien Dec 04 '15 at 17:10
  • 1
    the `note on names` section of `?numeric` may be related – rawr Dec 04 '15 at 17:11
  • 1
    To expand on @joran's comment: `as(x, 'double')` just does `{class(x) <- "numeric"; x}` (`Class` is converted from `"double"` to `"numeric"` near the top of `as`). As Joran illustrated, calling `class(x) <- "numeric"` on an integer vector doesn't do anything. – Joshua Ulrich Dec 04 '15 at 17:12
  • @JoshuaUlrich Exactly, I wasn't sure on the why, but that's simply what I saw happening when I stepped through `as(x,"double")`. – joran Dec 04 '15 at 17:13
  • @JoshuaUlrich That pinpoints the problem, the fact that `Class` gets converted from `"double"` to `"numeric"` – Matthew Plourde Dec 04 '15 at 17:14
  • 1
    @MatthewPlourde Are you sure? When I reverse that conversion while stepping through `as`, it just kicks back an error saying no method for converting integer to double... – joran Dec 04 '15 at 17:19
  • 7
    **Poll:** vote here if you think this is a bug. – Josh O'Brien Dec 04 '15 at 17:38
  • 3
    **Poll:** vote here if you think it is **not** a bug. – Josh O'Brien Dec 04 '15 at 17:40
  • @JoshO'Brien I'm going to submit this bug under "Accuracy", but perhaps it belongs under "S4methods", "Language", or just "Misc"? – Matthew Plourde Dec 04 '15 at 17:44
  • @MatthewPlourde Actually, I was about to just send a note to R-devel. (Have just written it up.) Is submitting as a bug the preferred route? – Josh O'Brien Dec 04 '15 at 17:52
  • @joran Yeah, I wonder if that's related to why I'm [struggling here](http://stackoverflow.com/questions/34091811/convert-columns-of-arbitrary-class-to-the-class-of-matching-columns-in-another-d?noredirect=1#comment55940208_34091811) , i.e., (to reiterate) `class(as(1L, "numeric"))` returns `"integer"` – rbatt Dec 04 '15 at 17:54
  • 1
    @MatthewPlourde [Here's the R-devel query I just submitted](https://stat.ethz.ch/pipermail/r-devel/2015-December/072079.html) – Josh O'Brien Dec 04 '15 at 17:56
  • @JoshO'Brien good call, probably best to let r-devel respond first. – Matthew Plourde Dec 04 '15 at 17:57
  • @rbatt I was trying to answer your question when I ran into this problem. – Matthew Plourde Dec 04 '15 at 17:57
  • @MatthewPlourde I think this is the problem I faced in my "failed attempt" example. It'll be interesting to see how this plays out. The resolution here might give a tip on a best practice for generic conversion. I'm glad you asked this question! – rbatt Dec 04 '15 at 18:02
  • @JoshO'Brien The bug, if that is how it is viewed, is in coerce.c as detailed in my expanded answer. – James Dec 06 '15 at 14:37
  • @James -- That's what I was referencing (too obliquely, I guess!) in my first comment to this question. I was puzzled that it got no attention until, just now, I realized that you have to first attempt one conversion (e.g. `as(1:4, "numeric")`) before that S4 method gets added to the list. Until then, `getMethod("coerce", c(from="integer", to="numeric"))` fails. Do you have any understanding of how that S4 method creation takes place and how it's triggered by a call to `as(4, "numeric")`? (Really nice job, BTW, on the edit to your answer. +1'd). – Josh O'Brien Dec 06 '15 at 15:37
  • @James FWIW, That dispatch to the S4 method, whose only effect is to do the currently pointless `class(from) <- "numeric"` and then return the unmodified value, is why I feel like this *has* to be a bug. What I don't know is whether the fault lies with the C code, or with some automatic method creation that produces the function returned by `getMethod("coerce", c(from="integer", to="numeric"))` , which "steps in front of" the method that R-core would really like to be used (i.e. the one for signature `c(from="ANY", to="numeric")`). – Josh O'Brien Dec 06 '15 at 15:49
  • @JoshO'Brien `class<-` has the benefit of retaining attributes, which may be of use. But I agree an explicit method for this conversion would be better if only for clarity. The actual behaviour should be changed in `R_set_class` since the function as currently written considers integers as already numeric, which is contrary to the documentation. However, could this cause knock-on effects? – James Dec 06 '15 at 16:42
  • @James @JoshO'Brien it makes sense that `as.numeric` treats integers differently than `as(x, 'numeric')` and `class(x) <- 'numeric'` once you understand that the former is referring to the storage mode and the latter two are referring to the class. This ambiguity is unfortunate, but understandable. However, even though "double" is not technically a class, `class(x) <- 'double'` does convert `x` to `numeric`, so `as(x, 'double')`, which does nothing, should be consistent with this, in the same way that `as(x, 'numeric')` and `class(x) <- 'numeric'` are consistent with one another. – Matthew Plourde Dec 07 '15 at 13:55
  • @MatthewPlourde Have you done `as(1:4, "numeric"); getMethod("coerce", c(from="integer", to="numeric"))`, which displays the code that actually gets dispatched to when you do `as(x, 'numeric')`? If the idea is that `as(..., "numeric")` should do nothing to integer vectors, then why go through the motions of a call to `class(from) <- "numeric"` that that function entails? **That** bit of code is why I think this is a bug; it's got two branches that both do the same thing (i.e. leave the integer vector unchanged). If what you say is all true, why not just make the body be `{from}`?? – Josh O'Brien Dec 07 '15 at 17:34
  • @JoshO'Brien, perhaps so that in the following case, you actually get a conversion to numeric: `x <- 1L;attr(x, 'class') <- 'myclass';str(x);y <- as(x,'numeric',strict=TRUE);class(y);str(y)` – Matthew Plourde Dec 07 '15 at 17:48
  • 1
    @MatthewPlourde -- Actually, that works as it does because it dispatches to the other S4 `coerce` method for conversion to numeric -- the method that's used for everything under the sun *except* for vectors of class `"integer"`. (You can see that by doing `selectMethod("coerce", c("myclass", "numeric"))`. It's the one listed as having signature `from="ANY", to="numeric"` in the listing of methods returned by doing `showMethods("coerce")`.) – Josh O'Brien Dec 07 '15 at 17:55
  • @JoshO'Brien Even though `as` is written to dispatch the `ANY` method to integer vectors with an extra class attribute, perhaps the `integer-numeric` method is written defensively so that extra class information will always be stripped if `strict` is `TRUE`, since that `coerce` method could be retrieved and called outside of `as`. – Matthew Plourde Dec 07 '15 at 18:21
  • @MatthewPlourde I've posted a related question [here](http://stackoverflow.com/q/34141757/980833). Will be interested to hear your thoughts, if any. – Josh O'Brien Dec 07 '15 at 19:40
  • 5
    @rawr Glad to see my R-devel post [got a response](https://stat.ethz.ch/pipermail/r-devel/2015-December/072095.html), and one that was worth the wait ;) – Josh O'Brien Dec 09 '15 at 05:06

2 Answers

`as` is for coercing to a new class, and `double` technically isn't a class but rather a `storage.mode`.

y <- x
storage.mode(y) <- "double"
identical(x, y)
# [1] FALSE
identical(as.double(x), y)
# [1] TRUE
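
One practical difference between the two routes is worth a quick sketch: `storage.mode<-` keeps attributes such as names, whereas `as.double` strips them. The vector `z` and its names below are made up purely for illustration.

```r
# Hypothetical named integer vector (names chosen only for illustration)
z <- c(a = 1L, b = 2L)

# storage.mode<- changes the type and keeps attributes
z_kept <- z
storage.mode(z_kept) <- "double"

# as.double() also changes the type but drops attributes, including names
z_dropped <- as.double(z)

typeof(z_kept)     # "double"
names(z_kept)      # "a" "b"
typeof(z_dropped)  # "double"
names(z_dropped)   # NULL
```

If attributes matter downstream, the `storage.mode<-` form is probably the safer of the two.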

The argument `"double"` is handled as a special case by `as`, which will attempt to coerce to the class `numeric`. The class `integer` already inherits from `numeric`, so there is no change.

is.numeric(x)
# [1] TRUE

Not so fast...

While the above made sense, there is some further confusion. From `?double`:

It is a historical anomaly that R has two names for its floating-point vectors, double and numeric (and formerly had real).

double is the name of the type. numeric is the name of the mode and also of the implicit class. As an S4 formal class, use "numeric".

The potential confusion is that R has used mode "numeric" to mean ‘double or integer’, which conflicts with the S4 usage. Thus is.numeric tests the mode, not the class, but as.numeric (which is identical to as.double) coerces to the class.
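
The distinction drawn in that passage between type, mode, and class can be checked directly. This is only a small sketch using base R functions; the variable names `x` and `d` are arbitrary.

```r
x <- 1:10
d <- as.double(x)

typeof(x); mode(x); class(x)   # "integer", "numeric", "integer"
typeof(d); mode(d); class(d)   # "double",  "numeric", "numeric"

# is.numeric() is TRUE for both, which is the source of the ambiguity
is.numeric(x)  # TRUE
is.numeric(d)  # TRUE

# is.double() looks at the type and so tells them apart
is.double(x)   # FALSE
is.double(d)   # TRUE

# as.numeric and as.double give the same result
identical(as.numeric(x), as.double(x))  # TRUE
```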

Therefore `as` should really change `x` according to the documentation... I will investigate further.

The plot is thicker than whipped cream and cornflour soup...

Well, if you debug `as`, you find that the following method eventually gets created and used, rather than the method with the `c("ANY", "numeric")` signature for the `coerce` generic, which would call `as.numeric`:

function (from, strict = TRUE) 
if (strict) {
    class(from) <- "numeric"
    from
} else from
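
To look at this on your own installation, something along these lines may help. It is a sketch only: what gets printed depends on the R version, and the `tryCatch` wrapper is my own precaution in case the lookup fails.

```r
library(methods)

x <- 1:4
# Per the comments on the question, the integer -> numeric method only
# shows up after as() has been called once with this signature
invisible(as(x, "numeric"))

# selectMethod() also considers inherited methods; tryCatch() returns NULL
# instead of stopping if no method is found on this R version
m <- tryCatch(
  selectMethod("coerce", c(from = "integer", to = "numeric")),
  error = function(e) NULL
)
print(m)

# List every coerce method currently known to the session
showMethods("coerce")
```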

So `class<-` actually gets called on `x`, which eventually means `R_set_class` is called from coerce.c. I believe the following part of the function determines the behaviour:

...
else if(!strcmp("numeric", valueString)) {
    setAttrib(obj, R_ClassSymbol, R_NilValue);
    if(IS_S4_OBJECT(obj)) /* NULL class is only valid for S3 objects */
      do_unsetS4(obj, value);
    switch(TYPEOF(obj)) {
    case INTSXP: case REALSXP: break;
    default: PROTECT(obj = coerceVector(obj, REALSXP));
    nProtect++;
    }
...

Note the `switch` statement: it breaks out without doing any coercion for integer (`INTSXP`) and real (`REALSXP`) values.
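
The effect of that `switch` can be observed from R itself. The sketch below contrasts a character vector, which falls through to the default branch, with an integer vector; the variable names are arbitrary, and the integer result is left uncommented because it reflects whatever the installed R version does.

```r
# Character input falls through to the default branch and is coerced
s <- c("1", "2", "3")
class(s) <- "numeric"
typeof(s)   # "double"

# Integer input hits the INTSXP case; per the C code above, no coercion
i <- 1:3
class(i) <- "numeric"
typeof(i)

# class(x) <- "double" does convert, as noted in the comments on the question
j <- 1:3
class(j) <- "double"
typeof(j)   # "double"
```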

Bug or not?

Whether or not this is a bug depends on your point of view. Integers are numeric in one sense, as confirmed by `is.numeric(x)` returning `TRUE`, but strictly speaking they are not of class `numeric`. On the other hand, since integers are promoted to double automatically whenever they are combined with doubles (although integer-only arithmetic that overflows yields `NA` with a warning rather than a double), one may view the two as conceptually the same. There are two major differences: (i) integers require less storage space, which may be significant for larger vectors, and (ii) when interacting with external code that has greater type discipline, conversion costs may come into play.
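
A short check of how integer arithmetic behaves at the limit, together with the storage difference, can be sketched as follows. The names `big` and `overflow` are arbitrary.

```r
big <- .Machine$integer.max   # largest representable integer

# Mixing an integer with a double promotes the result to double
typeof(big + 1)    # "double"
typeof(1L + 0.5)   # "double"

# Integer-only overflow is not promoted: the result is NA, with a warning
overflow <- suppressWarnings(big + 1L)
is.na(overflow)    # TRUE
typeof(overflow)   # "integer"

# Storage: integers take roughly half the space of doubles
object.size(integer(1e6))
object.size(numeric(1e6))
```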

James
  • Can you give any insight on joran's first comment, which does not use "double", but seems related? `x <- 1:10; class(x); class(x) <- "numeric"; class(x)` – Frank Dec 04 '15 at 17:49
  • So, is this why `class(as(1L, "numeric"))` doesn't work? I.e., my "failed approach" in [this question about class conversion](http://stackoverflow.com/q/34091811/2343633)? – rbatt Dec 04 '15 at 17:52
  • 2
    @Frank it's funny though that `class(x) <- 'double'` works, even though, as @James points out, "double" is not a class. – Matthew Plourde Dec 04 '15 at 18:01
  • 1
    @Frank Because integer is numeric already. It is not something that really needs to be worried about as integers will be promoted to doubles on overflow in R. Integers are more storage efficient and may be required by C/Fortran code, but otherwise can be considered equivalent. – James Dec 04 '15 at 18:04
  • @James Thanks for this explanation! Does that mean my reasoning/ approach [in this answer](http://stackoverflow.com/a/34094737/2343633) is sound? I can use `as(x, 'classname')` to convert `x` to the class (when method available), and when there's an exception like `"integer"` not going to `"numeric"`, it's OK because the exception is due to consistent promotion when needed. Obviously the distinction could still matter under certain conditions, so good to be aware of it. – rbatt Dec 04 '15 at 18:13
  • @rbatt On further reflection, it is more complicated. I've made an edit, but will complete when I can use R on my desktop rather than on r-fiddle. – James Dec 04 '15 at 18:51
  • @rbatt I think I have found the cause now. See the edit. – James Dec 06 '15 at 14:31

`as(x, "double")`: Methods are pre-defined for coercing any object to one of the basic datatypes. For example, `as(x, "numeric")` uses the existing `as.numeric` function. These built-in methods can be listed by `showMethods("coerce")`. These functions manage the relations that allow coercing an object to a given class.

`as.double(x)`: `as.double` is a generic function. It is identical to `as.numeric`. Methods should return an object of base type `"double"`. `as.double` creates, coerces to, or tests for a double-precision vector.
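
Whether the two routes agree for integer input can be checked on a given installation with a few lines. This is a sketch only: the `as()` results are left uncommented because, as the discussion on the question suggests, they depend on how the `coerce` method is dispatched in the R version at hand.

```r
library(methods)

x <- 1:10

# as.double() always returns a double vector
typeof(as.double(x))       # "double"

# The as() results are what the question is about; compare them yourself
typeof(as(x, "numeric"))
typeof(as(x, "double"))
```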

Josh O'Brien
Pontios
  • Incorrect. `as(x, "numeric")` uses the `coerce()` S4 method dispatched by the signature `c("integer", "numeric")`, which is **not** `as.numeric()`. – Josh O'Brien Dec 04 '15 at 17:24
  • Type help(as) in your r console and read the paragraph before the "References"...... – Pontios Dec 04 '15 at 17:27
  • 1
    OK, point taken. What I said holds for objects of class `"integer"`, but for objects of all other classes, the method with signature `c("ANY", "numeric")` will be dispatched, doing what the documentation says it does. (Do have a look at `showMethods("coerce")`, as recommended in that same paragraph, to see why `"integer"` class objects are getting treated differently.) – Josh O'Brien Dec 04 '15 at 17:34