Trying to create a time series per hour in R.
I have a data frame recording the number of vehicles per hour. It looks like this:
> head(df)
# A tibble: 6 x 8
interval cars vans trucks total `mean speed` `% occupation` hour
<dttm> <int> <int> <int> <int> <dbl> <dbl> <int>
1 2017-10-09 00:00:00 7 0 0 7 7.37 1. 0
2 2017-10-09 01:00:00 24 0 0 24 16.1 3. 1
3 2017-10-09 02:00:00 27 0 0 27 18.1 2. 2
4 2017-10-09 03:00:00 47 3 0 50 31.5 3. 3
5 2017-10-09 04:00:00 122 1 5 128 48.0 16. 4
6 2017-10-09 05:00:00 353 6 2 361 66.3 20. 5
> tail(df,1)
# A tibble: 1 x 8
interval cars vans trucks total `mean speed` `% occupation` hour
<dttm> <int> <int> <int> <int> <dbl> <dbl> <int>
1 2018-03-15 20:00:00 48 0 2 50 31.5 5. 20
Following the answer to "starting a daily time series in R", which clearly explains how to create a daily ts, I converted this data frame to a time series as follows:
ts2Start <- df$interval[1]
ts2End <- df$interval[nrow(df)]
indexPerHour <- seq(ts2Start, ts2End, by = 'hour')
Since a year has 365 days of 24 hours each, I created the ts as:
> df.ts <- ts(df$total, start = c(2017, as.numeric(format(indexPerHour[1], '%j'))),
+ frequency=24*365)
where as.numeric(format(indexPerHour[1], '%j')) returns 282.
To validate what I'm doing, I checked whether the date obtained from the index matches the first row of my data frame:
head(date_decimal(index(df.ts)),1)
but while my first date/time should be "2017-10-09 00:00:00", I'm getting "2017-01-12 16:59:59 UTC".
It looks as if the first index of the df.ts series started at roughly 282/24 days into the year.
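To make that observation concrete, here is a minimal sketch of the arithmetic, assuming (as documented for ts()) that the second element of start counts observations within the year, so with frequency = 24*365 it is read as an hour rather than a day. The vector total is a made-up stand-in for df$total, and to_datetime() is a hypothetical helper that only handles non-leap years.

```r
# Made-up stand-in for df$total: 48 hourly counts.
total <- seq_len(48)

freq <- 24 * 365        # observations per year, as in the question
day_of_year <- 282      # value of format(indexPerHour[1], '%j')

# Hypothetical helper: decimal year -> POSIXct, assuming a 365-day year.
to_datetime <- function(decimal_year) {
  year <- floor(decimal_year)
  origin <- as.POSIXct(paste0(year, "-01-01"), tz = "UTC")
  origin + round((decimal_year - year) * 365 * 24 * 3600)
}

# As in the question: 282 is taken as the 282nd hour of 2017.
ts_as_written <- ts(total, start = c(2017, day_of_year), frequency = freq)
offset_hours <- (time(ts_as_written)[1] - 2017) * freq   # hours elapsed since 1 Jan
first_as_written <- to_datetime(time(ts_as_written)[1])

# Converting the day of year into an hour-of-year position instead.
hour_of_year <- (day_of_year - 1) * 24 + 1
ts_shifted <- ts(total, start = c(2017, hour_of_year), frequency = freq)
first_shifted <- to_datetime(time(ts_shifted)[1])

print(offset_hours / 24)   # days elapsed before the first observation
print(first_as_written)
print(first_shifted)
```

If that reading of start is right, the first variant lands about 11.7 days into January, which matches the date I am seeing, while the second lands on 9 October.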
I do not understand what I'm doing wrong. How does the start parameter work in ts()?
I also checked the post "How to Create a R TimeSeries for Hourly data", where it is suggested to use the xts package.
The issue is that I'm learning from a book that uses tslm(), which does not seem to support xts objects.
Can I use ts() to create an hourly time series?