I am trying to pass a sequence of dates to a dataframe:

DF_1 <- as.data.frame(matrix(ncol=2))
days <- seq(as.Date("2016-01-01"), as.Date(Sys.time(),"%Y-%m-%d"), by="days")

for (i in 1:length(days)) {
  print(days[i])
  DF_1[i,1] <- days[i]
}

The output of the print call is:

[1] "2021-06-23"
[1] "2021-06-24"
[1] "2021-06-25"
[1] "2021-06-26"
[1] "2021-06-27"
[1] "2021-06-28"

But column 1 of DF_1 is:

16801
16802
16803
16804
16805

Why does the sequence of dates change when it is stored in the data frame?

Ronak Shah
mantanam
  • You should avoid the `for` loop. It's better just to pass in the data to the data.frame constructor: `DF_1 <- data.frame(days = seq(as.Date("2016-01-01"), as.Date(Sys.time(),"%Y-%m-%d"), by="days"))`. Otherwise you have to worry about partially changing column data types. – MrFlick Jun 28 '21 at 00:54
  • This post https://stackoverflow.com/questions/6434663/looping-over-a-date-or-posixct-object-results-in-a-numeric-iterator should give you the reason why this happens along with possible solutions. – Ronak Shah Jun 28 '21 at 02:51
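
The pitfall described in the linked post can be reproduced in a few lines. This is only a minimal sketch, not code from the question: the three-day days vector and the names direct and indexed are made up for illustration.

```r
days <- seq(as.Date("2016-01-01"), by = "day", length.out = 3)

# Looping over the Date vector itself: for() drops the attributes,
# so each d is a plain number (days since 1970-01-01).
direct <- character(0)
for (d in days) {
  direct <- c(direct, class(d))
}

# Looping over positions and subsetting keeps the Date class.
indexed <- character(0)
for (i in seq_along(days)) {
  indexed <- c(indexed, class(days[i]))
}

print(direct)   # "numeric" for every element
print(indexed)  # "Date" for every element
```

Note that the loop in the question already iterates by index, so there the class is lost when the value is assigned into the data frame, not in for itself.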

2 Answers

It is better to create DF_1 directly from the days vector:

DF_1 <- data.frame(days)
str(DF_1)
'data.frame':   2006 obs. of  1 variable:
 $ days: Date, format: "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" ...

Or, if we still want to use a for loop, initialize the column with the Date class instead of logical (matrix() fills the row with NA, which is logical):

DF_1 <- data.frame(col1 = as.Date(rep(NA, length(days))))

Now, if we do the loop

for (i in 1:length(days)) {
      print(days[i])
      DF_1[i,1] <- days[i]
    }

Checking the class:

str(DF_1)
'data.frame':   2006 obs. of  1 variable:
 $ col1: Date, format: "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" ...

The issue is that the Date values are coerced to their underlying numeric storage (days since 1970-01-01) and lose the Date class. The same behavior shows up with unlist

unlist(as.list(head(days)))
[1] 16801 16802 16803 16804 16805 16806

or with unclass

unclass(head(days))
[1] 16801 16802 16803 16804 16805 16806

If the input is a list, this can be avoided by combining its elements with c via do.call

do.call(c, as.list(head(days)))
[1] "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" "2016-01-05" "2016-01-06"

or by converting the numbers back to the Date class afterwards, specifying the origin in as.Date

as.Date(unlist(as.list(head(days))), origin = '1970-01-01')
[1] "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" "2016-01-05" "2016-01-06"
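
To see the same coercion outside of a data frame, here is a minimal sketch under the assumption that a single logical NA stands in for one cell of the matrix-based data frame. The names d, x, and y are illustrative only.

```r
d <- as.Date("2016-01-01")

# Logical placeholder, like the cells created by matrix(ncol = 2):
# assigning a Date into it keeps only the underlying number.
x <- NA
x[1] <- d
print(class(x))  # "numeric"
print(x)         # 16801

# Date placeholder: the assignment keeps the Date class.
y <- as.Date(NA)
y[1] <- d
print(class(y))  # "Date"
print(y)         # "2016-01-01"
```

The target vector decides the class of the result, which is why initializing the column as Date makes the loop behave as expected.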
akrun
  • Thanks, it works. But why is this happening? Shouldn't passing the values with a for loop work anyway? – mantanam Jun 28 '21 at 01:05
  • @Mantanamm have you tried the second solution I posted, with the `for` loop after initializing the data.frame column as `Date` class? – akrun Jun 28 '21 at 01:05
  • @Mantanamm `Date` storage mode is `integer` and there is a type switch which happens because your initial data was logical. Also, the input data is just a single row, which gets appended – akrun Jun 28 '21 at 01:06
  • Thanks for the explanation. I was trying to understand the concept more than solving the problem. – mantanam Jun 28 '21 at 01:11
  • @Mantanamm i understand that your problem may be how to concatenate the dates from an already running big loop. With `Dates`, it should be handled more carefully because of its integer storage mode – akrun Jun 28 '21 at 01:12

You can also use dplyr to add the dates to your initialized data frame.

library(dplyr)

# Set up your dataframe based on the length of days.
days <- seq(as.Date("2016-01-01"), as.Date(Sys.time(),"%Y-%m-%d"), by="days")
DF_1 <- as.data.frame(matrix(ncol=2, nrow = length(days)))

# Then, add the date data to the first column in the initialized dataframe.
DF_2 <- DF_1 %>%
  dplyr::mutate(V1 = days)
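
For comparison, the same whole-column replacement can be sketched in base R without dplyr. The fixed ten-day range below is an assumption chosen only to keep the example reproducible.

```r
# Ten days instead of "up to today", so the result is reproducible
days <- seq(as.Date("2016-01-01"), as.Date("2016-01-10"), by = "day")

# Same kind of initialization as above: two logical NA columns, V1 and V2
DF_1 <- as.data.frame(matrix(ncol = 2, nrow = length(days)))

# Replacing the whole column at once keeps the Date class,
# unlike filling the logical column cell by cell.
DF_1$V1 <- days

str(DF_1)
```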

Another option is to use purrr to turn the date data into a tibble. You can rename the column and create a second column if needed.

library(purrr)
library(dplyr)

df <- days %>% 
  purrr::map_df(as_tibble) %>% 
  dplyr::rename(date = 1) %>% 
  dplyr::mutate(V2 = NA)
AndrewGB