0

I have data in the following format

number_of_tickets : "01-01-2019", "02-01-2019", "03-01-2019"......
date              :  1500       ,  1200       , "2000......

It is past two years of data and I need to computer total tickets opened yearly and monthly, something like below

      Jan     Feb    Mar....
2019  20570   18702  35078

2020  19794   11325  42723......

I am trying to using package lubridate and deplyr to summarize, mutate and many other things but not getting any headsway.

Any help will be appreciated!!!!

Thanks

Sinh Nguyen
  • 4,277
  • 3
  • 18
  • 26
  • 1
    Welcome to SO! Could you post a minimal reproducible example of your data (see [this](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610))? It would also be better to say specifically what's not working instead of just saying that you haven't been able to do it – csgroen Jan 04 '21 at 12:52

2 Answers2

2

I think you are looking for this

df <- data.frame(
  number_of_tickets = c(1500, 1200, 2000, 1000, 2000, 3000),
  date              = c("01-01-2019", "02-01-2019", "03-01-2019",
                        "01-01-2020", "02-01-2020", "03-01-2020"))


df$date <- as.Date(df$date, format = c("%d-%m-%Y"))

head(df)


df$month <- format(df$date, "%m")
df$year  <- format(df$date, "%y")


head(df)


aggregate(number_of_tickets ~ month + year,
          data = df,
          sum)

The output of the last call is

  month year number_of_tickets
1    01   19              4700
2    01   20              6000

HTH

MacOS
  • 1,149
  • 1
  • 7
  • 14
1

with tidyverse and lubridate, you can also proceed

df <- data.frame(
  number_of_tickets = c(1500, 1200, 2000, 1000, 2000, 3000),
  date              = c("01-01-2019", "02-01-2019", "03-02-2019",
                        "01-01-2020", "02-01-2020", "03-02-2020"))

library(lubridate)
library(tidyverse)

df %>% mutate(month = month(as.Date(date, format = "%d-%m-%Y")),
              year = year(as.Date(date, format = "%d-%m-%Y"))) %>%
  pivot_wider(id_cols = "year", names_from = month, values_from = number_of_tickets, values_fn = sum)

# A tibble: 2 x 3
   year   `1`   `2`
  <dbl> <dbl> <dbl>
1  2019  2700  2000
2  2020  3000  3000

with pivottabler library

library(pivottabler)
library(lubridate)

df$date <- as.Date(df$date, format = "%d-%m-%Y")
df$Month <- month(df$date)
df$Year <- year(df$date)


qpvt(df, rows = "Month", 
     columns = "Year", 
     calculations = "sum(number_of_tickets)")

       2019  2020  Total  
1      2700  3000   5700  
2      2000  3000   5000  
Total  4700  6000  10700