I have a data frame like this one:

date    value
"2010Q1"  1.200
"2010Q2"  1.203
"2010Q3"  2.111
"2010Q4"  2.165
"2011Q1"  1.455
"2011Q2"  1.356
"2011Q3"  2.056
"2011Q4"  2.134

I would like to sum the four quarters of each year to get the yearly total. I am looking for a function that returns:

date   value
2010   6.679
2011   7.001

Thanks in advance!

  • Remove Quarter information from `date` and then sum by group - `aggregate(value~date, transform(df, date = sub('Q\\d$', '', date)), sum)` https://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group – Ronak Shah Sep 01 '20 at 14:24

2 Answers

You could take the first four characters of the date column as the year, and then aggregate by it:

dat[,"year"] <- substr(dat[,"date"], 1, 4)
aggregate(value ~ year, data=dat, FUN=sum)
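
For completeness, here is a minimal self-contained sketch of the same idea in base R. The data frame `dat` is rebuilt from the values in the question, and the result name `yearly` is only an illustrative choice; it assumes `date` is a character column in the "YYYYQn" form shown above.

```r
# Sample data copied from the question; `dat` matches the name used above
dat <- data.frame(
  date  = c("2010Q1", "2010Q2", "2010Q3", "2010Q4",
            "2011Q1", "2011Q2", "2011Q3", "2011Q4"),
  value = c(1.200, 1.203, 2.111, 2.165,
            1.455, 1.356, 2.056, 2.134),
  stringsAsFactors = FALSE
)

# The year is the first four characters of each "YYYYQn" label
dat$year <- substr(dat$date, 1, 4)

# Sum `value` within each year (`yearly` is a hypothetical name)
yearly <- aggregate(value ~ year, data = dat, FUN = sum)
print(yearly)
```

With the question's data this should give one row per year, with totals of about 6.679 for 2010 and 7.001 for 2011.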
Karsten W.

I would probably use dplyr together with tsibble and lubridate: tsibble's `yearquarter()` parses the quarter labels, and lubridate's `year()` extracts the year from them. But you can also do it by just extracting the year substring from the date.

library(dplyr)
library(tsibble)
library(lubridate)

df <- tibble::tribble(
  ~date   , ~value,
  "2010Q1",  1.200,
  "2010Q2",  1.203,
  "2010Q3",  2.111,
  "2010Q4",  2.165,
  "2011Q1",  1.455,
  "2011Q2",  1.356,
  "2011Q3",  2.056,
  "2011Q4",  2.134
)

df <- df %>% 
  mutate(yearquarter = yearquarter(date),
         year = year(yearquarter))

df %>% 
  group_by(year) %>% 
  summarise(sum_value = sum(value))
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 2 x 2
#>    year sum_value
#>   <dbl>     <dbl>
#> 1  2010      6.68
#> 2  2011      7.00

Created on 2020-09-01 by the reprex package (v0.3.0)
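
The substring alternative mentioned at the top of this answer could look roughly like the sketch below. It is not part of the reprex above; it assumes every `date` starts with a four-digit year, and the result name `yearly` is only illustrative. The `.groups = "drop"` argument needs dplyr 1.0.0 or later.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  date  = c("2010Q1", "2010Q2", "2010Q3", "2010Q4",
            "2011Q1", "2011Q2", "2011Q3", "2011Q4"),
  value = c(1.200, 1.203, 2.111, 2.165,
            1.455, 1.356, 2.056, 2.134),
  stringsAsFactors = FALSE
)

yearly <- df %>%
  # Take the first four characters as the year, without parsing the quarter
  mutate(year = as.integer(substr(date, 1, 4))) %>%
  group_by(year) %>%
  # .groups = "drop" returns an ungrouped tibble and silences the message
  summarise(sum_value = sum(value), .groups = "drop")

print(yearly)
```

This avoids the tsibble and lubridate dependencies, at the cost of not validating that the strings are real year-quarter labels.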

FilipW