In R you can easily return the cycle part of a time series object with the cycle() function, e.g.:

> series <- ts(1:50, frequency = 4, start = 2011)
> cycle(series)
     Qtr1 Qtr2 Qtr3 Qtr4
2011    1    2    3    4
2012    1    2    3    4
2013    1    2    3    4
2014    1    2    3    4
2015    1    2    3    4
2016    1    2    3    4
2017    1    2    3    4
2018    1    2    3    4
2019    1    2    3    4
2020    1    2    3    4
2021    1    2    3    4
2022    1    2    3    4
2023    1    2   

However, I have never been able to figure out a nice clean way to return the "period" part (e.g. the year for quarterly data). In most cases, you can do a simple:

> floor(time(series))
     Qtr1 Qtr2 Qtr3 Qtr4
2011 2011 2011 2011 2011
2012 2012 2012 2012 2012
2013 2013 2013 2013 2013
2014 2014 2014 2014 2014
2015 2015 2015 2015 2015
2016 2016 2016 2016 2016
2017 2017 2017 2017 2017
2018 2018 2018 2018 2018
2019 2019 2019 2019 2019
2020 2020 2020 2020 2020
2021 2021 2021 2021 2021
2022 2022 2022 2022 2022
2023 2023 2023 

This returns the year. However, I have found that for some data (usually high-frequency data), floating-point error causes the first time point of a period to return the value of the previous period (e.g. the time is stored as something like 2010.9999999 rather than 2011, so floor() returns 2010). We can artificially introduce the problem into the data as follows:

> seriesprec <- ts(1:50, frequency = 4, start = 2010.999999999999)
> floor(time(seriesprec))
     Qtr1 Qtr2 Qtr3 Qtr4
2011 2010 2011 2011 2011
2012 2011 2012 2012 2012
2013 2012 2013 2013 2013
2014 2013 2014 2014 2014
2015 2014 2015 2015 2015
2016 2015 2016 2016 2016
2017 2016 2017 2017 2017
2018 2017 2018 2018 2018
2019 2018 2019 2019 2019
2020 2019 2020 2020 2020
2021 2020 2021 2021 2021
2022 2021 2022 2022 2022
2023 2022 2023    

Now we see that floating-point error is throwing off the returned value, even though all.equal() reports the two time indices as equal:

> all.equal(time(seriesprec), time(series))
[1] TRUE
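
The apparent contradiction comes from all.equal() comparing with a numeric tolerance (about 1.5e-8 by default), whereas floor() acts on the exact stored value. A minimal sketch of that check, reusing the two series defined above:

```r
series     <- ts(1:50, frequency = 4, start = 2011)
seriesprec <- ts(1:50, frequency = 4, start = 2010.999999999999)

# all.equal() tolerates small numeric differences, so this is TRUE
isTRUE(all.equal(time(seriesprec), time(series)))

# An exact comparison shows the first time point sits just below 2011,
# which is why floor() drops it into 2010
time(seriesprec)[1] < 2011

# The largest discrepancy between the two time indices is tiny but non-zero
max(abs(as.numeric(time(seriesprec)) - as.numeric(time(series))))
```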

The simplest solution I have found that seems to take care of these edge cases is:

round(time(series) - (cycle(series) - 1)*deltat(series))

but this seems like reasonably complicated code for a very simple task. Particularly when cycle() is a base function, it seems like there should be another base function to return the other half of the time definition.
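
For use inside a package, the one-liner above could be wrapped in a small helper. This is only a sketch: period_part() is a hypothetical name, not a base R function, and it assumes that periods begin at integer time values (as they do for the yearly periods in the examples here).

```r
# period_part() is a hypothetical helper name, not part of base R
period_part <- function(x) {
  stopifnot(is.ts(x))
  # Step each observation back to the first cycle of its period,
  # then round away any floating-point noise
  round(time(x) - (cycle(x) - 1) * deltat(x))
}

seriesprec <- ts(1:50, frequency = 4, start = 2010.999999999999)
years <- as.numeric(period_part(seriesprec))
head(years, 5)
```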

By the way, I am aware of packages that handle dates and times very nicely, but since a lot of the things I do eventually get wrapped into packages, I would rather not add something like lubridate as a dependency for something that can be solved with one (very cumbersome) line of base R code.

Thank you!

Barker
  • `trunc(time(ldeaths))` – d.b Feb 28 '17 at 18:49
  • @d.b The object returned by `time()` for a time series object is not of class `timeDate` so the ability of `trunc` to round to specific time precision does not help. Instead it functions like `floor()` in this context and has the same floating point precision problems as that function. – Barker Feb 28 '17 at 18:57
  • Just add a small amount: `floor(time(series) + eps)`. You can use any reasonably small number as `eps`; `eps <- deltat(series) / 2` is a general possibility. – G. Grothendieck Feb 28 '17 at 19:43
  • @G.Grothendieck, you can also add a small amount using `offset`: `floor(time(seriesprec, offset = 0.5))` – d.b Feb 28 '17 at 19:55
  • @d.b your most recent suggestion seems to be working. Could you please write it up, along with an explanation of what `offset` is doing, so I can accept it as an answer? – Barker Mar 01 '17 at 00:39

1 Answer

One way is to add a suitably small value to time() before taking floor() or trunc(). As G. Grothendieck mentioned in the comments, deltat(series)/2 is a suitable small value, and the offset argument of time() is a convenient way of adding it. From ?time:

offset

can be used to indicate when sampling took place in the time unit. 0 (the default) indicates the start of the unit, 0.5 the middle and 1 the end of the interval.

Passing offset = 0.5 to time() is equivalent to adding deltat(series)/2 to every time point.
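
A quick way to convince yourself of that equivalence, as a sketch on the quarterly series from the question (the variable names are illustrative only):

```r
series <- ts(1:50, frequency = 4, start = 2011)

# Shift each time point to the middle of its sampling interval, two ways
shifted_by_offset <- as.numeric(time(series, offset = 0.5))
shifted_by_hand   <- as.numeric(time(series)) + deltat(series) / 2

# Both give the same values up to numeric tolerance
all.equal(shifted_by_offset, shifted_by_hand)
```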

So, you should be able to get the correct period part using

floor(time(seriesprec, offset = 0.5))
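
As a rough sketch of how this might be packaged and tried on other data, the helper below wraps the call above. period_of() is a hypothetical name, and the monthly series is made up purely for illustration; like the original line, it assumes periods begin at integer time values.

```r
# period_of() is a hypothetical helper name, not part of base R
period_of <- function(x) {
  stopifnot(is.ts(x))
  # Move each time point to the middle of its interval before flooring,
  # so tiny floating-point errors at period boundaries no longer matter
  as.numeric(floor(time(x, offset = 0.5)))
}

# The perturbed quarterly series from the question
seriesprec <- ts(1:50, frequency = 4, start = 2010.999999999999)
head(period_of(seriesprec), 5)

# A monthly series that starts part-way through a year (November 2015)
monthly <- ts(1:30, frequency = 12, start = c(2015, 11))
table(period_of(monthly))
```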
d.b