I have a dataset in which each customer name appears multiple times. I would like to collapse the rows to one per customer name and month, summing the supporting variables. I would like to use dplyr, but am having trouble summing the supporting variables (dep_delay and arr_delay in the example). The reprex below uses carrier as a stand-in for the customer name. Thank you for taking the time to look at this example!

Ideally the output would look something like this:

carrier / month / dep_delay / arr_delay

AA / 1 / 3412 / 12234

UA / 1 / 1517 / 2594

AA / 1 / 12342 / 1231

UA / 1 / 121 / 1234

#The code is listed below

library(tidyverse)
library(readr)
library(lubridate)
library(nycflights13)

flights_updated <- flights[,c(10,2,6,9)]
flights_updated <- group_by(flights_updated, carrier, month) %>% 
summarise (dep_delay = sum(dep_delay), arr_delay = sum(arr_delay)) 

I have also tried the following alternatives, to no avail:

flights_updated <- flights_updated %>% group_by(carrier, month) %>% summarise_at(vars(dep_delay, arr_delay), sum)

aggregate(cbind(dep_delay, arr_delay) ~ carrier + month, data = flights_updated, sum, na.rm = TRUE)

DonnyDolio

1 Answer

After waiting over the weekend for guidance, I found an answer from @Talat to the question "How to sum a variable by group", which provided the guidance I needed.

#Load packages
library(tidyverse)
library(dplyr)
library(readr)
library(lubridate)
library(nycflights13)

flights_updated <- flights[,c(10,2,6,9)]

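# na.rm = TRUE drops missing delays; without it, any group containing an NA sums to NA
flights_updated <- flights_updated %>% 
  group_by(carrier, month) %>% 
  summarise(dep_delay = sum(dep_delay, na.rm = TRUE),
            arr_delay = sum(arr_delay, na.rm = TRUE))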

flights_updated
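
A note on missing values, since it may explain the trouble with the sums: dep_delay and arr_delay in flights contain NA values, and sum() returns NA for any group that contains one unless na.rm = TRUE is passed. Below is a minimal sketch on a small made-up data frame (toy and toy_summed are hypothetical names, and the numbers are not taken from nycflights13). It also uses across(), which assumes dplyr 1.0.0 or later, to sum several columns in one call:

```r
library(dplyr)

# Hypothetical toy data standing in for the four flights columns
toy <- tibble(
  carrier   = c("AA", "AA", "UA", "UA", "AA"),
  month     = c(1, 1, 1, 1, 2),
  dep_delay = c(5, NA, 10, 20, 7),
  arr_delay = c(3, 4, NA, 8, 1)
)

# One row per carrier/month; NA delays are dropped before summing
toy_summed <- toy %>%
  group_by(carrier, month) %>%
  summarise(
    across(c(dep_delay, arr_delay), ~ sum(.x, na.rm = TRUE)),
    .groups = "drop"
  )

toy_summed
```

Whether dropping the NA rows is appropriate depends on what a missing delay means in your own data, so treat na.rm = TRUE as a choice rather than a default.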
DonnyDolio