2

My weekly dataset has different state_id associated with different cities.Value1 and value2 need to be aggregated to monthly level and then quarterly level.So am trying below code:

library(dplyr)  

df <- dataset %>%
      group_by(state_id,city_id) %>%
      group_by(three_month = round_date(weekly_dt, "quarter")) %>%   
      summarise_at(vars(starts_with('value')), mean)

But its popping out this error

 Error in mutate_impl(.data, dots) : 
      Column `three_month` must be length 1 (the group size), not 3766742

Note : All cities don't have same level of weekly data that is the reason I used group_by first. Can someone help me in R. EDIT :my dat

structure(list(city_id = c("B02", "B02", "B02", 
"B02", "B02", "B02"), state_id = c(609L, 609L, 
609L, 609L, 609L, 609L), weekly_dt = structure(c(17601, 
17545, 17447, 17727, 17510, 17664), class = "Date"), value1 = c(0.194669883125, 
0.35, 0.35, 0.124875972916667, 0.35, 0.140909438125), value2 = c(0.203018924883721, 
0.35, 0.35, 0.35, 0.35, 0.35)), class = c("data.table", "data.frame"
), row.names = c(NA, -6L), .internal.selfref = <pointer: 0x0000000004541ef0>)
riyan
  • 75
  • 1
  • 10
  • I think you need to switch the order. Make `three_month` in a `mutate` statement first, and then `group_by` all the variables `state_id, city_id, three_month` after that. – thelatemail Sep 10 '18 at 01:03
  • But it automatically inserting the values where I don't have the input weekly data and getting the quarterly data.@thelatemail.Can you help me – riyan Sep 10 '18 at 01:10
  • Please share sample of your data using `dput()` (not `str` or `head` or picture/screenshot) so others can help. See more here https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?rq=1 – Tung Sep 10 '18 at 01:29

1 Answers1

1

The mutate function adds additional columns to the data frame, which can then be referenced in a group_by. floor_date instead of round_date may be better here because all dates within the quarter will be placed in the same quarter.

library(dplyr)  
library(lubridate)

df <- dataset %>%
  mutate(three_month = floor_date(weekly_dt, "quarter")) %>%
  group_by(state_id, city_id, three_month) %>%
  summarise_at(vars(starts_with('value')), mean)

# A tibble: 4 x 5
# Groups:   state_id, city_id [?]
# state_id city_id three_month value1 value2
#      <int> <chr>   <date>       <dbl>  <dbl>
# 1      609 B02     2017-10-01   0.350  0.350
# 2      609 B02     2018-01-01   0.272  0.277
# 3      609 B02     2018-04-01   0.141  0.350
# 4      609 B02     2018-07-01   0.125  0.350
Andrew Lavers
  • 4,328
  • 1
  • 12
  • 19