
I have a data set of daily energy usage from 01 Jan 2016 through 07 Nov 2017. One of the fields is a non-working-day flag (nwd): 1 marks a non-working day and 0 marks a working day.

The data is structured like this:

Date,usage,avgtemp,nwd
2016-01-01,105986,28.5,1
2016-01-02,105548,29.2,1
.
.
.
2017-11-07,98457,23.5,0

I created a data frame with these values without any problems. I then created two more data frames from it, one with nwd = 1 (non-working days) and the other with nwd = 0 (working days).
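
For reference, a minimal sketch of that split in base R. The rows are illustrative values only, and the names `energy`, `non_working` and `working` are placeholders chosen for this example rather than anything from the real data set.

```r
# Illustrative rows in the Date,usage,avgtemp,nwd layout (values are made up).
energy <- read.csv(text = "Date,usage,avgtemp,nwd
2016-01-01,105986,28.5,1
2016-01-02,105548,29.2,1
2016-01-04,110210,27.9,0
2016-01-05,111034,28.3,0", stringsAsFactors = FALSE)

# Parse the date column so it can later serve as a time index.
energy$Date <- as.Date(energy$Date)

# One data frame per day type.
non_working <- subset(energy, nwd == 1)
working     <- subset(energy, nwd == 0)
```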

I am trying to generate a time series (using the zoo or xts package; I am open to either) for each of these two data frames so that I can run stationarity tests (ADF/PP) on them and then fit ARIMA models to forecast the usage.
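
A rough sketch of the intended workflow, under the assumption that the observations of one subset are treated as equally spaced. The data here is simulated and the ARIMA order is an arbitrary placeholder, not a recommendation. `PP.test()` and `arima()` come from base R's stats package; `adf.test()` comes from the tseries package and is only run if that package is installed.

```r
library(zoo)

set.seed(1)

# Simulated stand-in for one subset: 120 usage values behaving like a random walk.
n <- 120
usage <- 100000 + cumsum(rnorm(n, sd = 500))

# zoo series indexed 1, 2, 3, ... and its plain ts equivalent.
z <- zoo(usage)
x <- as.ts(z)

# Phillips-Perron unit root test (stats package).
pp <- PP.test(x)

# Augmented Dickey-Fuller test, only if tseries is available.
if (requireNamespace("tseries", quietly = TRUE)) {
  adf <- tseries::adf.test(x)
}

# Fit an ARIMA model (order chosen arbitrarily for illustration) and forecast 7 steps.
fit <- arima(x, order = c(1, 1, 0))
fc  <- predict(fit, n.ahead = 7)
```

Here `fc$pred` holds the point forecasts and `fc$se` their standard errors.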

Can I use a time series for such data sets, given that the data is not quite regular? Each of these series will have gaps: the working-day series may have fewer than 5 consecutive days in a week if holidays fall in between, and the same applies to the non-working-day series.
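
To make the irregularity concrete, here is a small sketch with made-up working-day dates in which a weekend and a mid-week holiday are missing. It shows that zoo accepts such an index as-is, and how the gaps could be padded with NA if a full calendar grid is wanted instead.

```r
library(zoo)

# Made-up working-day observations; 2016-01-06 and the weekend are absent.
dates <- as.Date(c("2016-01-04", "2016-01-05", "2016-01-07",
                   "2016-01-08", "2016-01-11"))
usage <- c(110210, 111034, 109877, 112450, 110935)

# zoo does not require equally spaced index values.
z <- zoo(usage, order.by = dates)

# Strict regularity check: are all gaps between observations the same size?
strictly_regular <- is.regular(z, strict = TRUE)

# Size of each gap, in days.
gaps <- diff(index(z))

# Optional: pad onto a full daily grid, leaving NA on the missing days.
full_days <- seq(start(z), end(z), by = "day")
z_padded  <- merge(z, zoo(, full_days))
```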

I cannot aggregate this to a weekly level, as I need to forecast at a daily level and possibly at a half-hourly level later. I might also want to try ARDL modelling later, using avgtemp as one of the regressors.
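
As an illustration of what such a model could look like, here is a minimal ARDL(1,1) sketch fitted by ordinary least squares with base R's `lm()`. The data is simulated, the coefficients used to generate it are arbitrary, and the column names `usage_lag1` and `avgtemp_lag1` are placeholders. Note that "lag 1" here means the previous observation in the series, which is not necessarily the previous calendar day when the series has gaps.

```r
set.seed(42)

# Simulated data: usage depends on its own lag and on current and lagged temperature.
n <- 200
avgtemp <- 25 + rnorm(n, sd = 3)
usage <- numeric(n)
usage[1] <- 100000
for (t in 2:n) {
  usage[t] <- 20000 + 0.8 * usage[t - 1] +
    300 * avgtemp[t] - 100 * avgtemp[t - 1] + rnorm(1, sd = 200)
}

# Line up each observation with the previous one (first row is lost to the lag).
d <- data.frame(
  usage        = usage[-1],
  usage_lag1   = usage[-n],
  avgtemp      = avgtemp[-1],
  avgtemp_lag1 = avgtemp[-n]
)

# ARDL(1,1): one lag of the response, current value plus one lag of the regressor.
fit <- lm(usage ~ usage_lag1 + avgtemp + avgtemp_lag1, data = d)
```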

P.S. I found a post that is somewhat similar to mine, but I could not get it working based on the responses there:

how to convert data frame into time series in R

  • `read.csv.zoo("myfile")` and `read.zoo(myDataFrame)` will convert such data to zoo. If your analysis functions require regularly spaced values and `z` is a zoo series with dates then `zoo(coredata(z))` will give a zoo series with index values of 1, 2, 3, ... which you can later map back if need be. – G. Grothendieck Dec 04 '17 at 20:07
  • I checked it and it does just that, not that I doubted it. How would I then use this zoo object (which is not padded with the non-working or holiday dates) for my arima() modelling in R? Doesn't arima() require the object to be a time series? Or is it preferable to pad the missing non-working dates before using the series for modelling? – Deepak Agarwal Dec 05 '17 at 03:24
  • If your analysis function requires a `ts` series then `as.ts(z)` where `z` is a zoo object will create one. – G. Grothendieck Dec 05 '17 at 04:12
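
The suggestions in the comments above can be sketched as follows. This is only an illustration with made-up values, and the `lookup` data frame is a hypothetical helper for mapping positions back to dates, not something the zoo package provides.

```r
library(zoo)

# Made-up working-day observations; 2016-01-06 and the weekend are absent.
dates <- as.Date(c("2016-01-04", "2016-01-05", "2016-01-07",
                   "2016-01-08", "2016-01-11"))
usage <- c(110210, 111034, 109877, 112450, 110935)
z <- zoo(usage, order.by = dates)

# Drop the dates: the index becomes 1, 2, 3, ... (equally spaced positions).
z_seq <- zoo(coredata(z))

# Convert to a plain ts object for functions that expect one.
x <- as.ts(z_seq)

# Keep a lookup so positions can be mapped back to calendar dates later.
lookup <- data.frame(position = index(z_seq), date = index(z))

# Restore the original dates on any result that is aligned with x.
z_back <- zoo(as.numeric(x), order.by = lookup$date)
```

Re-indexing this way treats consecutive observations as equally spaced, so a Friday-to-Monday step counts the same as a Monday-to-Tuesday step. Whether that is acceptable depends on the model.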
