
Here is an easy example. I have a data frame with three dates in it:

Data <- as.data.frame(as.Date(c('1970/01/01', '1970/01/02', '1970/01/03')))
names(Data) <- "date"

Now I add a column consisting of the same entries:

for(i in 1:3){
  Data[i, "date2"] <- Data[i, "date"]
}

Output looks like this:

        date date2
1 1970-01-01     0
2 1970-01-02     1
3 1970-01-03     2

For reasons I don't understand, the class of column date2 is numeric instead of Date, which is the class of date. Curiously, if you tell R explicitly to use the Date class:

for(i in 1:3){
  Data[i, "date3"] <- as.Date(Data[i, "date"])
}

it doesn't make any difference.

        date date2 date3
1 1970-01-01     0     0
2 1970-01-02     1     1
3 1970-01-03     2     2

The problem seems to lie in the use of subsetting with []. The same thing happens in more interesting examples, where you have two columns of dates and want to create a third one that picks a date from one of the two other columns depending on some factor.

Of course we can fix everything in retrospect by doing something like:

Data$date4 <- as.Date(Data$date2, origin = "1970-01-01")

but I'm still wondering: why? Why is this happening? Why can't my dates just stay dates when being transferred to another column??

Vincent
  • It might be related to this question: http://stackoverflow.com/questions/15996692/cannot-assign-columns-as-date-by-reference-in-data-table although data.table plays no role in my example – Vincent Jul 01 '13 at 14:51

1 Answer


This is not a final solution, but I think it can help you understand what is going on.

Here is your data:

Data <- data.frame(date = 
                  as.Date(c('2000/01/01', '2012/01/02', '2013/01/03')))

Take these two vectors, one typed by default as numeric and the second as Date.

> vv <- vector("numeric", 3)
> vv.Date <- vector("numeric", 3)
> class(vv.Date) <- "Date"
> vv
[1] 0 0 0
> vv.Date
[1] "1970-01-01" "1970-01-01" "1970-01-01" ## the Date vector is initialized to the origin, 1970-01-01
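
To see why the same zeros print differently, it can help to look at how a Date is stored. The following is only a small sketch (the names `d`, `stripped` and `restored` are illustrative): a Date is a double counting days since 1970-01-01, with a class attribute on top.

```r
d <- as.Date("1970-01-03")

typeof(d)    # storage type is "double"
class(d)     # the class attribute is "Date"

# Dropping the class leaves the day count since 1970-01-01.
stripped <- unclass(d)

# Putting the class back (with the origin) recovers the same date.
restored <- as.Date(stripped, origin = "1970-01-01")
```

This is also why a column that has lost its class shows 0, 1, 2 for the first three days of 1970.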

Now if I try to assign the first element of each vector as you do in the first step of your loop:

> vv[1] <- Data$date[1]
> vv.Date[1] <- Data$date[1]
> vv
[1] 10957     0     0
> vv.Date
[1] "2000-01-01" "1970-01-01" "1970-01-01"

As you can see, the typed vector is filled in correctly. What happens is that when you assign a scalar value into an existing vector, R internally tries to convert the value to the type of that vector. To return to your example: when you run the loop below, you are creating a numeric vector (like vv) and then trying to assign dates to it:

for(i in 1:3){
  Data[i, "date3"] <- as.Date(Data[i, "date"])
}
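
One way to read this is that `[<-` dispatches on the class of the vector being assigned into, not on the class of the value. The sketch below uses illustrative names (`plain`, `typed`, `value`) and only base R plus `utils::getS3method`; it is meant as a check of that reading, not as a description of everything `[<-.data.frame` does.

```r
value <- as.Date("2012-01-02")

# Target without a class attribute: the default `[<-` is used,
# so the Date class of `value` is dropped and only the number is stored.
plain <- numeric(3)
plain[1] <- value
class(plain)

# Target that is already a Date: `[<-.Date` is used and the class survives.
typed <- as.Date(rep(NA, 3))
typed[1] <- value
typed[2] <- "2013-01-03"   # a string is converted with as.Date() on the way in
class(typed)

# A Date-specific replacement method is registered in base R.
is.function(utils::getS3method("[<-", "Date"))
```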

If you give date3 a type first, for example:

Data$date3 <- vv.Date

and then run the loop again:

for(i in 1:3){
  Data[i, "date3"] <- as.Date(Data[i, "date"])
}

you will get the expected result:

        date      date3
1 2000-01-01 2000-01-01
2 2012-01-02 2012-01-02
3 2013-01-03 2013-01-03
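
Putting the pieces together, here is a self-contained sketch of the same idea: create the column as a Date first, then fill it. The second half is an assumption about the "pick from one of two columns" case mentioned in the question; the column `other` and the flag `use_other` are hypothetical names made up for the illustration.

```r
Data <- data.frame(date = as.Date(c("2000-01-01", "2012-01-02", "2013-01-03")))

# Give the new column the Date class up front (all NA), then fill it row by row.
Data$date2 <- as.Date(rep(NA, nrow(Data)))
for (i in seq_len(nrow(Data))) {
  Data[i, "date2"] <- Data[i, "date"]
}
class(Data$date2)

# Hypothetical two-column case: take `other` where the flag is TRUE, else `date`.
Data$other <- as.Date(c("1999-12-31", "2011-12-31", "2012-12-31"))
use_other  <- c(FALSE, TRUE, FALSE)

Data$picked <- Data$date                          # start from an existing Date column
Data$picked[use_other] <- Data$other[use_other]   # overwrite only the flagged rows
class(Data$picked)
```

Starting from a copy of an existing Date column means the target is already typed, so no loop and no repair with `origin` is needed afterwards.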
agstudy
  • ... so OP just needs to initialize the column with something like `Data$date2 <- as.Date(NA)` before looping. – Matthew Plourde Jul 01 '13 at 15:55
  • This explains almost all the weird situations of this type that I have encountered. Another one, outside the scope of this question but related, is covered in the answer to this question: http://stackoverflow.com/questions/13781957/sapply-cannot-handle-date-correctly. I am adding the link here to make it easier to find again. – Vincent Jul 31 '14 at 15:10
  • Regarding the last comment: this answer is even more useful: http://stackoverflow.com/a/7698883/2298323 – Vincent May 22 '17 at 08:08