
[Image of the data frame]

This data frame has two columns, Date and sum_pips. I've been trying to use group_by to group the rows by month and find the total sum_pips for each month. For example, all the June rows in this data frame should add up to 2700 pips. Hopefully that makes sense.

jpsmith
Fozil8
  • Welcome to SO! Please see [this link](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on reproducible examples, especially on sharing your data using dput(). – Desmond Oct 05 '22 at 01:06

2 Answers


One approach would be to use the month() function from the lubridate package, which extracts the month from Date-class columns, as follows:

Sample data

set.seed(123)
df <- data.frame(Date = seq(as.Date("2022/1/1"), by = "day", length.out = 100),
                 sum_pips = rnbinom(100, mu = 55,  size = 0.75))

Code

library(dplyr)
library(lubridate)

df %>% 
  group_by(month(Date)) %>% 
  summarize(sum_all_pips = sum(sum_pips))

Output:

#   `month(Date)` sum_all_pips
#           <dbl>        <dbl>
# 1             1         1387
# 2             2         1663
# 3             3         1783
# 4             4          803
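A possible variation, offered as a sketch rather than as part of the original answer: naming the grouping expression avoids the backticked `month(Date)` column, and lubridate's `label = TRUE` argument returns abbreviated month names as an ordered factor, so the rows stay in calendar order. The result name `monthly` is only illustrative.

```r
library(dplyr)
library(lubridate)

# Same simulated data as in the sample above
set.seed(123)
df <- data.frame(Date = seq(as.Date("2022/1/1"), by = "day", length.out = 100),
                 sum_pips = rnbinom(100, mu = 55, size = 0.75))

monthly <- df %>%
  # label = TRUE gives an ordered factor (Jan < Feb < ...), so output sorts by calendar month
  group_by(month = month(Date, label = TRUE)) %>%
  summarize(sum_all_pips = sum(sum_pips))

monthly
```

Note that grouping on the month alone pools the same month across different years, which is what the question asks for ("all the June rows").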
jpsmith

This is effectively a duplicate of "Calculate the mean by group", except that you first need to extract the month for each row. Assuming your date column is of class Date (and not character strings), you can likely do something like:

transform(yourdata, mon = format(Date, format = "%b")) |>
  stats:::aggregate.formula(formula = sum_pips ~ mon, FUN = sum)

or in dplyr,

library(dplyr)
group_by(yourdata, mon = format(Date, format = "%b")) %>%
  summarize(total = sum(sum_pips))
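For the base-R route, here is a minimal sketch that avoids reaching into the stats namespace with `:::`, which ties the call to that method's internal argument names. It passes the formula positionally to the exported aggregate() generic instead. The small `yourdata` frame is a made-up stand-in, and using `month.abb` as factor levels is my assumption for keeping months in calendar order (format(Date, "%b") yields locale-dependent strings that sort alphabetically).

```r
# Hypothetical stand-in for `yourdata`: a Date column and a numeric sum_pips column
yourdata <- data.frame(
  Date = as.Date(c("2021-06-01", "2021-06-15", "2021-07-01", "2022-06-10")),
  sum_pips = c(100, 200, 50, 300)
)

# as.POSIXlt()$mon is 0-based (0 = January); month.abb supplies fixed English labels,
# and using it as the factor levels keeps the groups in calendar order
yourdata$mon <- factor(month.abb[as.POSIXlt(yourdata$Date)$mon + 1],
                       levels = month.abb)

# Formula given as the first positional argument, so the generic
# dispatches to the formula method without any `:::` access
totals <- aggregate(sum_pips ~ mon, data = yourdata, FUN = sum)
totals  # Jun: 600 (both years pooled), Jul: 50
```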
r2evans