In R, you can easily return the cycle part of a time series object with the cycle() function, e.g.:
> series <- ts(1:50, frequency = 4, start = 2011)
> cycle(series)
     Qtr1 Qtr2 Qtr3 Qtr4
2011    1    2    3    4
2012    1    2    3    4
2013    1    2    3    4
2014    1    2    3    4
2015    1    2    3    4
2016    1    2    3    4
2017    1    2    3    4
2018    1    2    3    4
2019    1    2    3    4
2020    1    2    3    4
2021    1    2    3    4
2022    1    2    3    4
2023    1    2
However, I have never been able to figure out a nice, clean way to return the "period" part (e.g. the year for quarterly data). In most cases, you can do a simple:
> floor(time(series))
     Qtr1 Qtr2 Qtr3 Qtr4
2011 2011 2011 2011 2011
2012 2012 2012 2012 2012
2013 2013 2013 2013 2013
2014 2014 2014 2014 2014
2015 2015 2015 2015 2015
2016 2016 2016 2016 2016
2017 2017 2017 2017 2017
2018 2018 2018 2018 2018
2019 2019 2019 2019 2019
2020 2020 2020 2020 2020
2021 2021 2021 2021 2021
2022 2022 2022 2022 2022
2023 2023 2023
This returns the year. However, I have found that for some data (usually high-frequency data), floating-point precision errors cause the first time point of a period to return the value of the previous period (e.g. the time is stored as something like 2010.9999999 rather than 2011, so floor() returns 2010). We can artificially introduce the problem into the data as follows:
> seriesprec <- ts(1:50, frequency = 4, start = 2010.999999999999)
> floor(time(seriesprec))
     Qtr1 Qtr2 Qtr3 Qtr4
2011 2010 2011 2011 2011
2012 2011 2012 2012 2012
2013 2012 2013 2013 2013
2014 2013 2014 2014 2014
2015 2014 2015 2015 2015
2016 2015 2016 2016 2016
2017 2016 2017 2017 2017
2018 2017 2018 2018 2018
2019 2018 2019 2019 2019
2020 2019 2020 2020 2020
2021 2020 2021 2021 2021
2022 2021 2022 2022 2022
2023 2022 2023
Now we see that floating-point error is throwing off the returned value, even though:
> all.equal(time(seriesprec), time(series))
[1] TRUE
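As far as I can tell, the TRUE is expected: all.equal() compares numeric values within a tolerance (sqrt(.Machine$double.eps) by default, about 1.5e-8), so it hides a difference of roughly 1e-12 that floor() still sees. A minimal sketch of that check, rebuilding the two series from above and using as.numeric() to strip the ts attributes before comparing:

```r
series     <- ts(1:50, frequency = 4, start = 2011)
seriesprec <- ts(1:50, frequency = 4, start = 2010.999999999999)

# Plain numeric time stamps, without the ts attributes.
t_clean <- as.numeric(time(series))
t_prec  <- as.numeric(time(seriesprec))

# The first time stamps differ by a tiny but non-zero amount.
d <- t_clean[1] - t_prec[1]
d > 0

# all.equal() tolerates the difference; an exact comparison does not.
isTRUE(all.equal(t_prec, t_clean))
identical(t_prec, t_clean)

# floor() is sensitive to which side of the integer the value falls on.
floor(t_prec[1])
floor(t_clean[1])
```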
The simplest solution I have found that seems to take care of these edge cases is:
round(time(series) - (cycle(series) - 1)*deltat(series))
but this seems like reasonably complicated code for a very simple task. Particularly since cycle() is a base function, it seems like there should be another base function to return the other half of the time definition.
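For reference, here is the workaround wrapped in a small helper so it can be checked against both series. The name period() is my own and hypothetical, not a base R function; this is only a sketch of the one-liner above, not a proposed general solution.

```r
# Hypothetical helper (not part of base R): the period (e.g. year) of each
# observation in a ts object.
period <- function(x) {
  stopifnot(is.ts(x))
  # Shift every time stamp back to the start of its period, then round to
  # absorb floating-point error in the stored times.
  round(time(x) - (cycle(x) - 1) * deltat(x))
}

series     <- ts(1:50, frequency = 4, start = 2011)
seriesprec <- ts(1:50, frequency = 4, start = 2010.999999999999)

# Both series should now report the same periods.
p_clean <- as.numeric(period(series))
p_prec  <- as.numeric(period(seriesprec))
identical(p_clean, p_prec)
```

Rounding after removing the within-period offset is what makes this robust: the shifted values sit next to an integer rather than just below one, so a small error in either direction rounds to the same year.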
By the way, I am aware of packages that handle dates and times very nicely, but since a lot of what I do eventually gets wrapped into packages, I would rather not add something like lubridate as a dependency for something that can be solved with one (very cumbersome) line of base R code.
Thank you!