Trying to create a time series per hour in R.

I have a data frame recording the number of vehicles per hour. It looks like this:

> head(df)
# A tibble: 6 x 8
      interval             cars  vans trucks total `mean speed` `% occupation`  hour
      <dttm>              <int> <int>  <int> <int>        <dbl>          <dbl> <int>
    1 2017-10-09 00:00:00     7     0      0     7         7.37             1.     0
    2 2017-10-09 01:00:00    24     0      0    24        16.1              3.     1
    3 2017-10-09 02:00:00    27     0      0    27        18.1              2.     2
    4 2017-10-09 03:00:00    47     3      0    50        31.5              3.     3
    5 2017-10-09 04:00:00   122     1      5   128        48.0             16.     4
    6 2017-10-09 05:00:00   353     6      2   361        66.3             20.     5

> tail(df,1)
    # A tibble: 1 x 8
      interval             cars  vans trucks total `mean speed` `% occupation`  hour
      <dttm>              <int> <int>  <int> <int>        <dbl>          <dbl> <int>
    1 2018-03-15 20:00:00    48     0      2    50         31.5             5.    20

Following the answer to "starting a daily time series in R", which clearly explains how to create a ts by day, I converted this df to a time series as follows:

ts2Start <- df$interval[1]
ts2End <- df$interval[nrow(df)]
indexPerHour <- seq(ts2Start, ts2End, by = 'hour')

Since we have 365 days in a year and 24h per day, I created the ts as:

> df.ts <- ts(df$total, start = c(2017, as.numeric(format(indexPerHour[1], '%j'))),
+                                frequency=24*365)

where

as.numeric(format(indexPerHour[1], '%j'))

returns 282

In order to validate what I'm doing I checked if the date obtained from the index is the same as the first row in my data frame

head(date_decimal(index(df.ts)),1)

but while my first date/time should be "2017-10-09 00:00:00", I'm getting "2017-01-12 16:59:59 UTC".

It looks as if the first index in the df.ts series starts about 282/24 days into the year.

I do not understand what I'm doing wrong. How does the start parameter work in ts()?
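For reference, a minimal sketch of how `start = c(2017, 282)` is interpreted when `frequency = 24*365`. The vector `x` is a hypothetical stand-in for `df$total`, and `toy.ts` is an illustrative name. With a two-element `start`, the second element is a position counted in units of 1/frequency, so here it is read as the 282nd hour of 2017, not the 282nd day:

```r
# Toy data: 48 hourly values (hypothetical, not the real traffic counts)
x <- seq_len(48)

# Same call shape as in the question
freq <- 24 * 365
toy.ts <- ts(x, start = c(2017, 282), frequency = freq)

# ts() stores the first time stamp as 2017 + (282 - 1) / freq,
# i.e. 281 hours after the start of the year
first_time <- time(toy.ts)[1]
hours_into_year <- (first_time - 2017) * freq

print(hours_into_year)        # hours elapsed since 1 January 00:00
print(hours_into_year / 24)   # the same offset expressed in days
```

An offset of 281 hours is a little under 12 days, which is consistent with the 12 January date reported above.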

I also checked the post: How to Create a R TimeSeries for Hourly data

where it is suggested to use xts package.

The issue is that I'm learning from a book that uses tslm(), and xts objects do not seem to be supported there.

Can I use ts() to create an hourly time series?

Ronak Shah
Elena
    I finally found what the problem was: when creating the ts object I must keep the same unit. Since frequency was 24*365, the start parameter should be 24*. With this the date/time is almost identical (the difference is 1 hour). – Elena May 19 '18 at 17:55
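The "same unit" idea from this comment can be sketched as follows. This is only an illustration under stated assumptions: the first timestamp is copied from the `head(df)` output, UTC is used to sidestep daylight-saving shifts, a 365-day year is assumed, and `x`, `toy.ts` and `start_hour` are hypothetical names. The start position is expressed in hours (1-based) so that it matches `frequency = 24*365`:

```r
# Assumed first observation, taken from the head(df) output in the question
first_obs <- as.POSIXct("2017-10-09 00:00:00", tz = "UTC")

day_of_year <- as.numeric(format(first_obs, "%j"))  # 282
hour_of_day <- as.numeric(format(first_obs, "%H"))  # 0

# Position counted in hours, 1-based: position 1 is 1 January 00:00
start_hour <- (day_of_year - 1) * 24 + hour_of_day + 1

x <- seq_len(48)  # hypothetical stand-in for df$total
toy.ts <- ts(x, start = c(2017, start_hour), frequency = 24 * 365)

# Convert the first index back to a date-time using base R only
hours_elapsed <- round((time(toy.ts)[1] - 2017) * 24 * 365)
recovered <- as.POSIXct("2017-01-01 00:00:00", tz = "UTC") + hours_elapsed * 3600
print(recovered)
```

Because `ts()` treats the second element of `start` as 1-based, omitting the `- 1` and `+ 1` adjustments shifts the whole series.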

1 Answer


You could use the xts package instead. For example:

library(xts)

# Hourly index; it must have exactly one entry per row of df
time_index <- seq(from = as.POSIXct("2016-01-01 00:00:00"),
                  to = as.POSIXct("2018-10-01 00:00:00"), by = "hour")

traff <- xts(df, order.by = time_index)
Dharman