
This is related to the question "Looping over a Date or POSIXct object results in a numeric iterator".

> dates <- as.Date(c("2013-01-01", "2013-01-02"))
> class(dates)
[1] "Date"
> for(d in dates) print(class(d))
[1] "numeric"
[1] "numeric"

I have two questions:

  1. What is the preferred way to iterate over a list of Date objects?
  2. I don't understand Joshua's answer (the accepted answer to the question linked above). I'll quote it here: "So your Date vector is being coerced to numeric because Date objects aren't strictly vectors". How is it determined that Date should be coerced to numeric?
Henrik
user443854
  • What I meant by my comment is: `is.vector(dates) # FALSE`, so Dates are not "vectors". Your second question should really be a comment to my previous answer. – Joshua Ulrich Jan 25 '13 at 18:01
  • So clearly `dates` is not a vector, and clearly it is `Date`. But what is it that makes it behave like a vector. What is it that makes it iterable? – user443854 Jan 25 '13 at 18:06
  • The `for` loop coerces it to a vector. – Joshua Ulrich Jan 25 '13 at 18:08
  • From the help page `?vector`. "For any mode, it [`is.vector`] will return FALSE if x has any attributes except names." In R-speak a "vector" does not mean that it can be accessed by position, but rather that it doesn't have attributes. It specifically states that factors are not vectors and it probably should also have stated that Date and POSIXt classed objects are not either. – IRTFM Jan 25 '13 at 18:09
  • To answer question 1. You can leave `dates` as a character vector and coerce within the loop or use the `seq_along()` technique noted in the post you linked... (or one of the answers below) – Justin Jan 25 '13 at 18:10
  • So `dates` could be a list of `Date` objects. Why did `for` decide to coerce it to a vector of numbers? Is there such a thing in R as a vector of `Date`s? – user443854 Jan 25 '13 at 18:10
  • Nope, see @DWin's comment above regarding the definition of vectors in R. Dates need attributes, like origin, to have any meaning. Unless you wanted a vector of lists of dates... or something silly like that! `is.vector(c(list(as.Date('2013-01-01')), list(as.Date('2013-01-02'))))`... But I don't think that helps at all – Justin Jan 25 '13 at 18:15
  • @JoshuaUlrich @DWin I don't think that is a standard definition of vector. Dates are vectors, but the `is.vector` function is confusing - it tells you if you have an atomic vector that does not have attributes. `is.atomic(as.Date("2012-01-01"))` is TRUE so Dates are atomic vectors. – hadley Jan 26 '13 at 14:09
  • @hadley: I agree, which is why I put "vectors" in quotes. It's also why, in the previous question, I said they "aren't _strictly_ vectors" (emphasis added). – Joshua Ulrich Jan 27 '13 at 01:35
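
The distinction drawn in the comments above can be checked directly. The following is a minimal sketch, not part of the original thread; the variable names (`vec_check`, `atomic_check`, `storage`, `raw_days`) are made up for illustration.

```r
dates <- as.Date(c("2013-01-01", "2013-01-02"))

# is.vector() is FALSE because the object carries a class attribute
vec_check <- is.vector(dates)

# The underlying storage is still an atomic double vector
atomic_check <- is.atomic(dates)
storage <- typeof(dates)

# Stripping the class exposes the day counts since 1970-01-01
raw_days <- unclass(dates)

print(vec_check)
print(atomic_check)
print(storage)
print(raw_days)
```

So `is.vector()` answers "is this an attribute-free vector?", while `is.atomic()` and `typeof()` describe what the data actually is underneath.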

1 Answer


There are two issues here. One is whether the input gets coerced from Date to numeric. The other is whether the output gets coerced to numeric.

Input

For loops coerce Date inputs to numeric because, as @DWin and @JoshuaUlrich point out, for loops take vectors, and Dates are technically not vectors (is.vector(dates) returns FALSE).

> for(d in dates) print(class(d))
[1] "numeric"
[1] "numeric"
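
If you do want a for loop, two common workarounds keep the Date class. This is a sketch of the `seq_along()` technique mentioned in the comments, plus a variant that converts to a list first; the accumulator names `classes_by_index` and `classes_by_list` are only for illustration.

```r
dates <- as.Date(c("2013-01-01", "2013-01-02"))

# Option 1: loop over positions and extract with [[, which dispatches on Date
classes_by_index <- character(0)
for (i in seq_along(dates)) {
  d <- dates[[i]]
  classes_by_index <- c(classes_by_index, class(d))
}

# Option 2: convert to a list first so each element keeps its class
classes_by_list <- character(0)
for (d in as.list(dates)) {
  classes_by_list <- c(classes_by_list, class(d))
}

print(classes_by_index)
print(classes_by_list)
```

Both approaches give the loop body a Date object rather than a bare number.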

On the other hand, lapply and its simplifier offspring sapply have no such restrictions.

> sapply( dates, function(day) class(day) )
[1] "Date" "Date"

Output

However! The output of class() above is a character vector, not a Date. If you try actually returning a Date object, sapply is not what you want.

lapply does not coerce to a vector, but sapply does:

> lapply( dates, identity )
[[1]]
[1] "2013-01-01"

[[2]]
[1] "2013-01-02"

> sapply( dates, identity )
[1] 15706 15707

That's because sapply's simplification step (simplify2array, as noted in the comments) coerces the output to a plain vector, which drops the Date class.
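
If you need a Date vector back rather than a list, one option is to combine the lapply result with c(). This is a sketch under the assumption that every element returned is a Date; the 7-day shift and the names `shifted_list` and `shifted` are only examples.

```r
dates <- as.Date(c("2013-01-01", "2013-01-02"))

# lapply returns a list whose elements are still Date objects
shifted_list <- lapply(dates, function(day) day + 7)

# c() dispatches to the Date method, so the result is a Date vector
shifted <- do.call(c, shifted_list)

print(class(shifted))
print(shifted)
```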

Summary

So: If you have a Date object and want to return a non-Date object, you can use lapply or sapply. If you have a non-Date object, and want to return a Date object, you can use a for loop or lapply. If you have a Date object and want to return a Date object, use lapply.

Resources for learning more

If you want to dig deeper into vectors, you can start with John Cook's notes, move on to the R Inferno, and then continue with SDA.

Ari B. Friedman
  • `sapply` does, because of `simplify2array` (matrices can only hold atomic vectors). You're returning the class, not the object itself. Try: `sapply(Sys.Date(), identity)`. `lapply` works though. – Joshua Ulrich Jan 25 '13 at 18:05
  • What about `> sapply(dates, class)` that prints `[1] "Date" "Date"`? Shouldn't `sapply` coerce to numeric vector? (according to the post and comment) – user443854 Jan 25 '13 at 18:18
  • @user443854: as I said, that's returning the class of the object (a character string, which is an atomic vector) as it exists inside the function call; it's not returning a Date. Would you allow me to give you $100 as "$100"? – Joshua Ulrich Jan 25 '13 at 18:21
  • @JoshuaUlrich: class of the object is character string whose value is "Date". But the actual object is numeric. Why the mismatch? – user443854 Jan 25 '13 at 19:10
  • @user443854: I don't know how else I can explain it... `simplify2array` coerces it to numeric, for reasons I've already explained. So `sapply` returns a number, not a Date, even though it's still a Date inside the function. – Joshua Ulrich Jan 25 '13 at 20:19
  • Ok, so I'm really confused now. Was the question how to iterate over a date pseudo-vector (what does one even call this?) without the *input* being coerced to numeric, or how do you *return* a collection of dates? My assumption was the former, since that is what the for loop has trouble with. – Ari B. Friedman Jan 25 '13 at 22:05
  • @JoshuaUlrich matrices are not limited to atomic vectors: `matrix(list(1, 2, 3, 4), nrow = 2)`. I think the real explanation is that either `simplify2array` is poorly written, or the Date class is missing needed S3 methods. – hadley Jan 26 '13 at 14:13
  • @AriB.Friedman are you confused about my question? I can clarify that. I certainly do not need to return a collection of dates, since this is what I start with; `dates` is a list of `Date`s. If your confusion is about pseudo-vectors (the ones @JoshuaUlrich puts in quotes), then I am also still confused. The first part of my question was practical, to solve an immediate problem. The second was an attempt to get a clear picture (still not there). @hadley hints something is poorly written -- it really looks like that. – user443854 Jan 28 '13 at 15:11
  • @user443854 My reading of the question was that you wanted the **Input** part of my answer. It seems like that worked for you? – Ari B. Friedman Jan 28 '13 at 15:25
  • @AriB.Friedman: Yes. But I am still missing coherent explanation of the coercing behavior. If `Date` becomes `numeric`, what about other arbitrary types? It would be nice to hear someone explain the purpose for this behavior (or maybe acknowledge this is a kink in the language). – user443854 Jan 28 '13 at 16:48
  • @user443854 The behavior is entirely consistent: If it's a `vector` (see `is.vector()` to test), then it works. If not, it doesn't. There's more info out there on R fundamental data types. Someone here can probably recommend a book (likely featuring yellow and blue prominently on the cover and written by Chambers) that will teach you more. – Ari B. Friedman Jan 28 '13 at 17:19
  • @AriB.Friedman: It troubles me that `for` does not work (using your language) for non-vectors (e.g. lists). In fact, as we learned, it coerces a list of `Date`s to numeric vector. My question is why? And where is consistency here? (If this is too much to answer here, please point me to a book/chapter to read, I am totally for it!) Thanks. – user443854 Jan 28 '13 at 18:56
  • @user443854 I assume there's some technical reason why relating to the underlying C code. I also assume they never really bothered to fix it because it would a) make things a little slower, and b) loops aren't very R-like in the first place. The `*apply` family commands handle this just fine, and they're what you should be reaching for in the first place. *Edit* added some references for you to dig deeper. – Ari B. Friedman Jan 28 '13 at 19:19
  • @AriB.Friedman I did not bother to buy the book from Amazon, but I did read the first two references and found them... pretty irrelevant. As for `apply`, it is implemented using a `for` loop, take a look. As @hadley points out, `Date`s *are* atomic vectors, they just happen to have attributes, so the poorly written `is.vector` returns `FALSE`. – user443854 Jan 31 '13 at 05:27
  • @AriB.Friedman @user443854 no, it is not consistent, and it is because `for` is written in C and does not look at the class of the object, or use `[` to extract individual components of the vector. – hadley Jan 31 '13 at 12:27
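
Since `for` hands the loop body the underlying numbers rather than calling `[`, another workaround is to restore the class inside the loop. This is a sketch only; it assumes the standard Date origin of 1970-01-01, and the names `restored` and `restored_dates` are illustrative.

```r
dates <- as.Date(c("2013-01-01", "2013-01-02"))

restored <- vector("list", length(dates))
i <- 0
for (d in dates) {
  i <- i + 1
  # d is a bare double here: the day count since 1970-01-01
  restored[[i]] <- as.Date(d, origin = "1970-01-01")
}

# Combine the list of single Dates back into one Date vector
restored_dates <- do.call(c, restored)

print(restored_dates)
```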