I am looking at the `flights` table from the nycflights13 package. I am trying to create a table from it that contains the number of flights per carrier per airport of origin. My initial idea was to count the flights of each airline for each airport of origin.

So the table could look like this:

number of flights / carrier / origin
200-AA-JFK
147-AA-ALM (because an airline may depart from several different airports)
etc...

Frankly, I have no idea how to approach this problem in terms of coding. I started with this simple two-liner:

flights %>%
  count(carrier) 

It shows me the number of flights for each airline. Is it somehow possible to add another count criterion, such as origin, so that the function counts the number of flights per airline for each origin?

shymilk

2 Answers

An option is to group_by origin and carrier and then summarise, taking the sum of 'flight' (the flight number column) together with the row count n()

library(nycflights13)
library(dplyr)
flights %>% 
  group_by(origin, carrier) %>%
  summarise(nflights = sum(flight), count = n())

If we don't need the sum of 'flight', then use count with multiple columns

flights %>%
   count(origin, carrier)
akrun
  • What does the sum(flight) do? It simply adds up the flight numbers, right? So leaving it out and just using the count = n() argument would show the number of flights / airline / airport, correct? – shymilk Nov 28 '19 at 20:09
  • @shymilk Yes, you are right. I was not sure what you really wanted, so I used `sum`. The `n()` gives the frequency, i.e. the number of rows for each group, which is basically what `count(origin, carrier)` gives – akrun Nov 28 '19 at 20:10
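
To make the point from the comments concrete, here is a minimal sketch on a made-up toy data frame (the name `toy` and its values are hypothetical, not taken from nycflights13). It assumes dplyr 1.0.0 or later for the `.groups` argument, and shows that `n()` inside `summarise()` yields the same per-group row counts as `count(origin, carrier)`, while `sum(flight)` merely adds up flight numbers.

```r
library(dplyr)

# Hypothetical stand-in for the flights table; values are invented for illustration
toy <- data.frame(
  origin  = c("JFK", "JFK", "JFK", "LGA", "LGA"),
  carrier = c("AA",  "AA",  "DL",  "AA",  "AA"),
  flight  = c(1L, 301L, 21L, 301L, 707L)
)

# group_by + summarise: flight_sum adds up flight numbers, n counts rows per group
by_group <- toy %>%
  group_by(origin, carrier) %>%
  summarise(flight_sum = sum(flight), n = n(), .groups = "drop")

# count() with two columns: one row per origin/carrier pair, count in column n
by_count <- toy %>%
  count(origin, carrier)

# The row counts agree; flight_sum is a different, not very meaningful quantity
same_counts <- identical(by_group$n, by_count$n)
```

On the real data, the same comparison can be made by swapping `toy` for `flights`.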
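You can use the data.table package. Since `flights` ships as a tibble, convert it to a data.table first, then count rows by group with the following command

library(data.table)
as.data.table(flights)[, .N, by = .(origin, carrier)]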
J.P. Le Cavalier