
I want to aggregate zoo data in R by periods of two, four or six months. As far as I can tell, zoo offers only two options for this kind of date processing:

a) as.yearmon => process daily data grouped by each month

b) as.yearqtr => process daily data grouped by fixed 3-month periods (Jan-Mar, Apr-Jun, Jul-Sep and Oct-Dec).

A minimal example

library(zoo)        
# creating a vector of Dates 
dt = as.Date(c("2001-01-01","2001-01-02","2001-04-01","2001-05-01","2001-07-01","2001-10-01"),
             "%Y-%m-%d")
# the original dates        
dt
[1] "2001-01-01" "2001-01-02" "2001-04-01" "2001-05-01" "2001-07-01" "2001-10-01"

# conversion to monthly data
as.yearmon(dt)
[1] "jan 2001" "jan 2001" "abr 2001" "mai 2001" "jul 2001" "out 2001"

# conversion to quarterly data
as.yearqtr(dt)
[1] "2001 Q1" "2001 Q1" "2001 Q2" "2001 Q2" "2001 Q3" "2001 Q4"

set.seed(0)
# irregular time series
daily_db = zoo(matrix(rnorm(3 * length(dt)),
                    nrow = length(dt),
                    ncol = 3),
             order.by = dt)
daily_db                                                
2001-01-01  1.2629543 -0.928567035 -1.1476570
2001-01-02 -0.3262334 -0.294720447 -0.2894616
2001-04-01  1.3297993 -0.005767173 -0.2992151
2001-05-01  1.2724293  2.404653389 -0.4115108
2001-07-01  0.4146414  0.763593461  0.2522234
2001-10-01 -1.5399500 -0.799009249 -0.8919211

# data aggregated by month
aggregate(daily_db,as.yearmon,sum)
                 V1           V2         V3
jan 2001  0.9367209 -1.223287482 -1.4371186
abr 2001  1.3297993 -0.005767173 -0.2992151
mai 2001  1.2724293  2.404653389 -0.4115108
jul 2001  0.4146414  0.763593461  0.2522234
out 2001 -1.5399500 -0.799009249 -0.8919211

# data aggregated by quarter
aggregate(daily_db,as.yearqtr,sum)
                V1         V2         V3
2001 Q1  0.9367209 -1.2232875 -1.4371186
2001 Q2  2.6022286  2.3988862 -0.7107260
2001 Q3  0.4146414  0.7635935  0.2522234
2001 Q4 -1.5399500 -0.7990092 -0.8919211

I want to define a function like:

as.yearperiod = function(x, period = 6) {...} # convert dates to semesters

To use this way:

# data aggregated by semester
aggregate(daily_db, as.yearperiod, period = 6, sum)

I expect a result like this one:

                V1         V2         V3
2001 S1  3.538950   1.175599  -2.147845
2001 S2 -1.125309  -0.035416  -0.639698
Marcos Vinicius

2 Answers


I suggest using the lubridate package to deal with custom date intervals. The task can be accomplished easily with floor_date, which rounds each date down to the start of its six-month interval:

six_m_interval <- lubridate::floor_date( dt , "6 months" )
# [1] "2001-01-01" "2001-01-01" "2001-01-01" "2001-01-01" "2001-07-01" "2001-07-01"

aggregate( daily_db , six_m_interval , sum )
#                  V1          V2         V3
# 2001-01-01  3.538950  1.17559873 -2.1478445
# 2001-07-01 -1.125309 -0.03541579 -0.6396977
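
The same idea extends to the two- and four-month periods mentioned in the question. The following is only a sketch: floor_months is a hypothetical helper name (not part of lubridate or zoo), and it relies on floor_date accepting multi-unit strings such as "4 months". The data are rebuilt exactly as in the question so that the block runs on its own.

```r
library(zoo)
library(lubridate)

# Data from the question, rebuilt so the example is self-contained
dt <- as.Date(c("2001-01-01", "2001-01-02", "2001-04-01",
                "2001-05-01", "2001-07-01", "2001-10-01"))
set.seed(0)
daily_db <- zoo(matrix(rnorm(3 * length(dt)), nrow = length(dt), ncol = 3),
                order.by = dt)

# Hypothetical helper: round dates down to the start of an n-month period.
# `period` is assumed to be a divisor of 12 (2, 3, 4 or 6).
floor_months <- function(x, period = 6) {
  floor_date(x, unit = paste(period, "months"))
}

# aggregate.zoo applies a function passed as `by` to the index of the series
by2 <- aggregate(daily_db, function(d) floor_months(d, period = 2), sum)
by4 <- aggregate(daily_db, function(d) floor_months(d, period = 4), sum)
by6 <- aggregate(daily_db, function(d) floor_months(d, period = 6), sum)
```

Here by6 should reproduce the six-month result shown above, while by2 and by4 are indexed by the first day of each two- or four-month block.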
GoGonzo

Date2period

Date2period takes a "Date" object and returns a character string representing a period (semester, etc.), depending on the value of the argument period, which should be a divisor of 12. Internally it converts to yearmon, extracts the year and the cycle (i.e. the month number), and generates the required string from those.

Date2period <- function(x, period = 6, sep = " S") {
  ym <- as.yearmon(x)
  paste(as.integer(ym), (cycle(ym) - 1) %/% period + 1, sep = sep)
}

To test the above:

library(zoo)

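# inputs
period <- 6
dt <- as.Date(c("2001-01-01","2001-04-01","2001-07-01","2001-10-01"))

# daily_db rebuilt on these four dates (the outputs below assume this version,
# not the six-date series from the question)
set.seed(0)
daily_db <- zoo(matrix(rnorm(3 * length(dt)), nrow = length(dt), ncol = 3),
                order.by = dt)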

Date2period(dt)
## [1] "2001 S1" "2001 S1" "2001 S2" "2001 S2"

aggregate(daily_db, Date2period, sum)
##                V1        V2          V3
## 2001 S1 0.9367209 -1.125309  2.39888622
## 2001 S2 2.6022286 -1.223287 -0.03541579
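
For periods other than six months, note that aggregate.zoo passes extra arguments on to FUN rather than to by, so a non-default period has to be supplied by wrapping Date2period in an anonymous function. A minimal sketch, using the six-date series from the question (named dt6 and db6 here purely for illustration) and arbitrary separators " P" and " B" as labels:

```r
library(zoo)

# Date2period as defined above, repeated so the block is self-contained
Date2period <- function(x, period = 6, sep = " S") {
  ym <- as.yearmon(x)
  paste(as.integer(ym), (cycle(ym) - 1) %/% period + 1, sep = sep)
}

# Six-date series from the question (dt6 and db6 are illustrative names)
dt6 <- as.Date(c("2001-01-01", "2001-01-02", "2001-04-01",
                 "2001-05-01", "2001-07-01", "2001-10-01"))
set.seed(0)
db6 <- zoo(matrix(rnorm(3 * length(dt6)), nrow = length(dt6), ncol = 3),
           order.by = dt6)

# Four-month periods: Jan-Apr, May-Aug, Sep-Dec
by4 <- aggregate(db6, function(x) Date2period(x, period = 4, sep = " P"), sum)

# Two-month periods: Jan-Feb, Mar-Apr, ..., Nov-Dec
by2 <- aggregate(db6, function(x) Date2period(x, period = 2, sep = " B"), sum)

# Six-month periods on the question's data
by6 <- aggregate(db6, Date2period, sum)
```

With this input, by6 should match the semester totals the question asks for, since it is built on all six dates.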

period2yearmon, period2Date

Here are additional conversion functions but for the other direction:

period2yearmon <- function(x, period = 6) {
     year <- as.numeric(sub("\\D.*", "", x))
     cyc <- as.numeric(sub(".*\\D", "", x))
     as.yearmon(year + period * (cyc - 1) / 12)
}

period2Date <- function(x, period = 6) as.Date(period2yearmon(x, period))

Here are some tests of these functions. Converting from Date to period and back to Date gives the date at the beginning of the period in which the input date lies, so we show the effect of this round trip in aggregate at the end.

# create a period string
d <- Date2period(dt)
## [1] "2001 S1" "2001 S1" "2001 S2" "2001 S2"

period2yearmon(d)
## [1] "Jan 2001" "Jan 2001" "Jul 2001" "Jul 2001"

period2Date(d)
## [1] "2001-01-01" "2001-01-01" "2001-07-01" "2001-07-01"

aggregate(daily_db, function(x) period2Date(Date2period(x)), sum)
##                   V1        V2          V3
## 2001-01-01 0.9367209 -1.125309  2.39888622
## 2001-07-01 2.6022286 -1.223287 -0.03541579

This could be made more sophisticated by creating S3 objects such as yearmon but for the purposes shown in the question that is not really needed.

G. Grothendieck