I have a data set of daily energy usage from 01 Jan 2016 through 07 Nov 2017. One of the fields is a non-working-day flag (nwd), where 1 marks a non-working day and 0 a working day.
The structure of the data looks like this:
Date,usage,avgtemp,nwd
2016-01-01,105986,28.5,1
2016-01-02,105548,29.2,1
.
.
.
2017-11-07,98457,23.5,0
I created a data frame with these values without any problems. I then created 2 other data frames, one with nwd = 1 and the other with nwd = 0, holding the non-working and working days respectively.
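For reference, the split described above can be sketched roughly as follows. This is only a minimal illustration: the inline rows are made-up stand-ins for my real file, and the names `csv_text`, `nwd_df` and `wd_df` are placeholders I chose for this post.

```r
# Toy rows in the same layout as my file; the values are invented for illustration
csv_text <- "Date,usage,avgtemp,nwd
2016-01-01,105986,28.5,1
2016-01-02,105548,29.2,1
2016-01-03,104870,27.9,1
2016-01-04,98120,26.4,0
2016-01-05,97640,25.8,0
2016-01-06,98310,26.1,0
2016-01-07,97995,24.9,0
2016-01-08,98457,23.5,0
2016-01-09,105210,24.2,1"

df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
df$Date <- as.Date(df$Date)  # parse ISO dates so they can later index a series

# Split on the flag: 1 = non-working day, 0 = working day
nwd_df <- df[df$nwd == 1, ]
wd_df  <- df[df$nwd == 0, ]

str(nwd_df)
str(wd_df)
```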
I am trying to generate a time series (using the zoo or xts package; I am open to either) for each of these 2 data frames, so that I can run stationarity tests (ADF/PP) on them and then fit ARIMA models to forecast the usage.
Can I use a time series for such data sets when the data is not regularly spaced? Each of these series will have gaps: the working-day series may have fewer than 5 consecutive days in a week if holidays fall in between, and the same applies to the non-working-day series.
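To make the question concrete, here is a minimal sketch of what I am attempting, run on simulated data rather than my real figures. It assumes the zoo and tseries packages are installed, and it uses weekends as a stand-in for non-working days (real holidays are ignored). The tests are given the bare values via `coredata()`, so they treat the observations as if they were consecutive, which is exactly the part I am unsure about.

```r
library(zoo)      # date-indexed series that tolerate an irregular index
library(tseries)  # adf.test() and pp.test()

set.seed(1)  # simulated data only; not my real usage figures

dates <- seq(as.Date("2016-01-01"), as.Date("2017-11-07"), by = "day")

# Assumption: Saturday/Sunday stand in for the nwd flag (wday: 0 = Sunday, 6 = Saturday)
nwd <- as.integer(as.POSIXlt(dates)$wday %in% c(0, 6))
usage <- 100000 + 5000 * (1 - nwd) + rnorm(length(dates), sd = 2000)

# One zoo series per day type, indexed by the actual calendar dates
wd_z  <- zoo(usage[nwd == 0], order.by = dates[nwd == 0])
nwd_z <- zoo(usage[nwd == 1], order.by = dates[nwd == 1])

# Spacing between consecutive observations, in days, to show the gaps
gaps <- as.numeric(diff(index(wd_z)))
print(table(gaps))

# coredata() drops the dates, so both tests ignore the gaps in the index
print(adf.test(coredata(wd_z)))
print(pp.test(coredata(wd_z)))
```

The zoo objects themselves are created without complaint; what I cannot tell is whether the tests and a subsequent ARIMA fit are still meaningful once the calendar gaps are ignored like this.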
I cannot aggregate this to a weekly level, as I need to forecast at a daily level and possibly at a half-hourly level later on. I might also want to do ARDL modelling later, using 'avgtemp' as one of the regressors.
P.S. I found a post that is somewhat similar to mine, but I can't get it working based on the responses there: