
I have a data frame with two columns: a date-time column and the corresponding values. I would like to add extra rows to extend the period the data covers. My data is hourly and starts at the beginning of 1990. I would like to extend it back in time, hourly, to the beginning of 1979, with the corresponding values set to NA. Is there a way to do this? Thanks

Ulas Im
    Welcome to SO. Please read [(1)](http://stackoverflow.com/help/how-to-ask) how do I ask a good question, [(2)](http://stackoverflow.com/help/mcve) How to create a MCVE as well as [(3)](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#answer-5963610) how to provide a minimal reproducible example in R. Then edit and improve your question accordingly. – Christoph Oct 25 '16 at 18:55

1 Answer


The most general way to fill gaps in, or extend the range of, your dataset's timestamps is to join the existing dataset to the vector of datetimes that you wish your data had. Any datetime with no match in the existing data gets NA for the value. Here is an example:

library(dplyr)

# Hourly timestamps covered by the existing dataset (the end date is assumed)
existing_hours = seq.POSIXt(
  from = as.POSIXct("1990-01-01 00:00:00"),
  to = as.POSIXct("2000-01-01 00:00:00"),
  by = "hour"
)

# tibble() replaces the deprecated data_frame()
df_foo = tibble(
  # assuming that this is the existing dataset
  Datetime = existing_hours,
  value = rnorm(n = length(existing_hours))
)

# Join against the full hourly range; hours without data get value = NA
df_foo_extended = df_foo %>% 
  full_join(
    tibble(
      Datetime = seq.POSIXt(
        from = as.POSIXct("1979-01-01 00:00:00"),
        to = as.POSIXct("2000-01-01 00:00:00"),
        by = "hour"
      )
    ),
    by = "Datetime"
  ) %>% 
  # full_join() puts the unmatched 1979-1989 rows last, so sort by time
  arrange(Datetime)

You do not tell us where your dataset ends, so I have arbitrarily assumed that it ends in 2000.
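If you would rather not assume an end date, the same join idea can be sketched in base R by building the hourly grid up to the last timestamp actually present in the data. This is only a minimal illustration: the names `existing`, `grid` and `extended`, the UTC time zone, and the tiny 48-hour dataset are all assumptions made for the example, not part of your data.

```r
# Hypothetical stand-in for the existing data: 48 hourly rows from 1990-01-01 (UTC assumed)
existing <- data.frame(
  Datetime = seq(
    from = as.POSIXct("1990-01-01 00:00:00", tz = "UTC"),
    by = "hour",
    length.out = 48
  )
)
existing$value <- rnorm(nrow(existing))

# Full hourly grid from the desired start up to the last existing timestamp
# (a start only two days earlier keeps the example small)
grid <- data.frame(
  Datetime = seq(
    from = as.POSIXct("1989-12-30 00:00:00", tz = "UTC"),
    to = max(existing$Datetime),
    by = "hour"
  )
)

# all.x = TRUE keeps every row of the grid; hours without data get value = NA
extended <- merge(grid, existing, by = "Datetime", all.x = TRUE)
extended <- extended[order(extended$Datetime), ]
```

For the real data you would replace the grid's start with `as.POSIXct("1979-01-01 00:00:00", tz = ...)`, using whatever time zone your timestamps are actually stored in.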

tchakravarty