2

In R, If I have this data

           date.hour temp
2014-01-05 20:00:00            16
2014-01-06 20:00:00            14
2014-01-06 22:00:00            18

and with seq I can get a sequence of date time

begin <- as.POSIXct('2014-1-5')
end <- as.POSIXct('2014-1-7')
seq(begin, end, by=2*3600)

how I can complete the data to something similar to

           date.hour temp
2014-01-05 00:00:00            NA
2014-01-05 02:00:00            NA
...
2014-01-05 18:00:00            NA
2014-01-05 20:00:00            16
2014-01-05 22:00:00            NA
...
2014-01-06 20:00:00            18
2014-01-06 22:00:00            14
...
2014-01-07 00:00:00            NA
JuanPablo
  • 23,792
  • 39
  • 118
  • 164
  • Does this answer your question? [Insert rows for missing dates/times](https://stackoverflow.com/questions/16787038/insert-rows-for-missing-dates-times) – camille Dec 18 '19 at 15:37

2 Answers2

2

If this is your sample data frame

dd<-data.frame(
    date.hour = structure(c(1388970000, 1389056400, 1389063600), class = c("POSIXct", "POSIXt"), tzone = ""),
    temp = c(16L, 14L, 18L)
)

then you can just merge() that with your sequence

begin <- as.POSIXct('2014-1-5')
end <- as.POSIXct('2014-1-7')
comp<-seq(begin, end, by=2*3600)

merge(data.frame(date.hour=comp), dd, all.x=T)

by setting all.x=T, missing values will be filled with NA.

MrFlick
  • 195,160
  • 17
  • 277
  • 295
  • ... but If I have many original values, how use line `date.hour = structure(c(1388970000, 1389056400, 1389063600), class = c("POSIXct", "POSIXt"), tzone = "")` – JuanPablo Mar 19 '15 at 19:51
  • I don't understand your question. I just used that syntax to make it easier to make your sample data reproducible. You can import your data how ever you like, just make sure your date values are proper POSIX date values. – MrFlick Mar 19 '15 at 19:57
  • thanks, now I undertand, the original values are not in POSIX, for this reason I convert the seq to string with `comp<- format(seq(begin, end, by=2*3600), "%Y-%m-%d %H:%M:%S")` – JuanPablo Mar 19 '15 at 20:37
2

Or similarly with data.table

Your data (from @rawr)

df <- read.table(header = TRUE, text = "date.hour temp
'2014-01-05 20:00:00'            16
'2014-01-06 20:00:00'            14
'2014-01-06 22:00:00'            18", colClasses = c('POSIXct','numeric'))

Solution

library(data.table)
dt <- data.table(date.hour = seq(begin, end, by=2*3600))
setkey(setDT(df), date.hour)
df[dt]
#               date.hour temp
#  1: 2014-01-05 00:00:00   NA
#  2: 2014-01-05 02:00:00   NA
#  3: 2014-01-05 04:00:00   NA
#  4: 2014-01-05 06:00:00   NA
#  5: 2014-01-05 08:00:00   NA
#  6: 2014-01-05 10:00:00   NA
#  7: 2014-01-05 12:00:00   NA
#  8: 2014-01-05 14:00:00   NA
#  9: 2014-01-05 16:00:00   NA
# 10: 2014-01-05 18:00:00   NA
# 11: 2014-01-05 20:00:00   16
# 12: 2014-01-05 22:00:00   NA
# 13: 2014-01-06 00:00:00   NA
# 14: 2014-01-06 02:00:00   NA
# 15: 2014-01-06 04:00:00   NA
# 16: 2014-01-06 06:00:00   NA
# 17: 2014-01-06 08:00:00   NA
# 18: 2014-01-06 10:00:00   NA
# 19: 2014-01-06 12:00:00   NA
# 20: 2014-01-06 14:00:00   NA
# 21: 2014-01-06 16:00:00   NA
# 22: 2014-01-06 18:00:00   NA
# 23: 2014-01-06 20:00:00   14
# 24: 2014-01-06 22:00:00   18
# 25: 2014-01-07 00:00:00   NA
David Arenburg
  • 91,361
  • 17
  • 137
  • 196