I've looked for similar threads but can't find a solution.
I've grouped the dataset below by carrier and successfully created new variables for the average and total delay times. Now I simply want to arrange the data by average delay, but when I run the code below, it returns the same values in every row. Can anyone help me figure out where I went wrong?
I'm using the dplyr package with the "flights" dataset, and I have filtered out the NA values using:
filter(!is.na(dep_delay), !is.na(arr_delay))
I got the data and exercise from section 5.6.7 of this resource: http://r4ds.had.co.nz/transform.html#exercises-11
bycarrier %>%
  transmute(
    arrsum = sum(arr_delay),
    arravg = mean(arr_delay),
    depsum = sum(dep_delay),
    depavg = mean(dep_delay)
  ) %>%
  arrange(desc(arravg))
Returns:
Adding missing grouping variables: `carrier`
Source: local data frame [327,346 x 5]
Groups: carrier [16]
   carrier arrsum  arravg depsum   depavg
     <chr>  <dbl>   <dbl>  <dbl>    <dbl>
 1      F9  14928 21.9207  13757 20.20117
 2      F9  14928 21.9207  13757 20.20117
 3      F9  14928 21.9207  13757 20.20117
 4      F9  14928 21.9207  13757 20.20117
 5      F9  14928 21.9207  13757 20.20117
 6      F9  14928 21.9207  13757 20.20117
 7      F9  14928 21.9207  13757 20.20117
 8      F9  14928 21.9207  13757 20.20117
 9      F9  14928 21.9207  13757 20.20117
10      F9  14928 21.9207  13757 20.20117
# ... with 327,336 more rows
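
For comparison, here is a minimal sketch of the same pipeline using summarise() in place of transmute(). Like mutate(), transmute() keeps one row per input row, so a grouped sum() or mean() is repeated on every row of its group, which matches the output above. summarise() collapses each group to a single row instead. The construction of bycarrier is not shown in the question, so the version here is an assumption, as are the nycflights13 package as the source of flights and the name carrier_delays.

```r
library(dplyr)
library(nycflights13)  # assumption: flights comes from this package

# Assumed reconstruction of bycarrier: drop the NA delays, then group by carrier
bycarrier <- flights %>%
  filter(!is.na(dep_delay), !is.na(arr_delay)) %>%
  group_by(carrier)

# summarise() returns one row per carrier rather than one row per flight
carrier_delays <- bycarrier %>%
  summarise(
    arrsum = sum(arr_delay),
    arravg = mean(arr_delay),
    depsum = sum(dep_delay),
    depavg = mean(dep_delay)
  ) %>%
  arrange(desc(arravg))  # highest average arrival delay first

carrier_delays
```

With summarise(), the result should have one row per carrier (16 groups, going by the output above), sorted from the highest to the lowest average arrival delay.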