
I have the following task for the "planes" data from the "nycflights13" package:

For each engine and its manufacturer, determine the production year: earliest (minimum), medium and latest (maximum).

I've been trying to solve this with table() or tapply() but couldn't find a solution. Any suggestions?

library(nycflights13)
data(planes)
head(planes)
  tailnum  year type                    manufacturer     model     engines seats speed engine   
  <chr>   <int> <chr>                   <chr>            <chr>       <int> <int> <int> <chr>    
1 N10156   2004 Fixed wing multi engine EMBRAER          EMB-145XR       2    55    NA Turbo-fan
2 N102UW   1998 Fixed wing multi engine AIRBUS INDUSTRIE A320-214        2   182    NA Turbo-fan
3 N103US   1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214        2   182    NA Turbo-fan
4 N104UW   1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214        2   182    NA Turbo-fan
5 N10575   2002 Fixed wing multi engine EMBRAER          EMB-145LR       2    55    NA Turbo-fan
6 N105UW   1999 Fixed wing multi engine AIRBUS INDUSTRIE A320-214        2   182    NA Turbo-fan

KacZdr

1 Answer

By "medium", do you mean the median? If so, you can try this:

library(dplyr)

planes %>%
  group_by(engines, manufacturer) %>%
  summarise(min = min(year, na.rm = TRUE), 
            medium = median(year, na.rm = TRUE), 
            max = max(year, na.rm = TRUE))

In base R, using aggregate():

aggregate(year ~ engines + manufacturer, planes, function(x)
  c(min = min(x, na.rm = TRUE),
    medium = median(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE)))
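
One thing to watch for with the aggregate() approach: when the summary function returns a vector, the three statistics end up in a single matrix column called year rather than in three separate columns. Below is a minimal sketch of how that could be flattened with do.call(data.frame, ...). The toy data frame and its values are made up for illustration and only stand in for planes.

```r
# Toy stand-in for nycflights13::planes (hypothetical values, illustration only)
toy <- data.frame(
  engines      = c(2, 2, 2, 2, 4, 4),
  manufacturer = c("AIRBUS", "AIRBUS", "AIRBUS", "EMBRAER", "BOEING", "BOEING"),
  year         = c(1998, 1999, 2004, 2002, 1990, NA)
)

# The formula interface drops rows with NA in year (na.action = na.omit),
# so na.rm = TRUE is not strictly needed inside the summary function.
res <- aggregate(year ~ engines + manufacturer, toy, function(x)
  c(min = min(x), medium = median(x), max = max(x)))

# res$year is a matrix with columns min, medium, max.
# Rebuilding the data frame splits it into ordinary columns.
flat <- do.call(data.frame, res)
names(flat)
```

After flattening, the statistics should be available as regular columns (year.min, year.medium, year.max), which makes later sorting or filtering more straightforward.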
Ronak Shah
  • Yes, your solution is correct, but it would be best to do it using only base R (I forgot to mention that in the post). Useful answer anyway ;) – KacZdr Jul 08 '21 at 11:25
  • You can use `aggregate`. See my updated answer @JacekSzyszko – Ronak Shah Jul 08 '21 at 11:41