0

I have 2 df.

df_tailnum, a list of unique tailnum

   TailNum
   <chr>  
 1 N657AW 
 2 N834AW 
 3 N605AW 
 4 N312AW 
 5 N309AW 
 6 N733UW 
 7 N177UW 
 8 N404US 
 9 N722UW 
10 N104UW

And df_planes, with information on the names of the tailnum

  tailnum type   manufacturer  issue_date model status aircraft_type engine_type year 
  <chr>   <chr>  <chr>         <chr>      <chr> <chr>  <chr>         <chr>       <chr>
1 N997AT  Corpo… BOEING        01/02/2003 717-… Valid  Fixed Wing M… Turbo-Fan   2002 
2 N997DL  Corpo… MCDONNELL DO… 03/11/1992 MD-88 Valid  Fixed Wing M… Turbo-Fan   1992 
3 N998AT  Corpo… BOEING        01/23/2003 717-… Valid  Fixed Wing M… Turbo-Fan   2002 
4 N998DL  Corpo… MCDONNELL DO… 04/02/1992 MD-88 Valid  Fixed Wing M… Turbo-Jet   1992 
5 N999CA  Forei… CANADAIR      07/09/2008 CL-6… Valid  Fixed Wing M… Turbo-Jet   1998 
6 N999DN  Corpo… MCDONNELL DO… 04/02/1992 MD-88 Valid  Fixed Wing M… Turbo-Jet   1992 

I would like to know how do I add value from year column in df_planes to df_tailnum as a new column base on the corresponding tailnum?

something like this:

   TailNum Year
   <chr>  
 1 N657AW 
 2 N834AW 
 3 N605AW 
 4 N312AW 
 5 N309AW 
 6 N733UW 
 7 N177UW 
 8 N404US 
 9 N722UW 
10 N104UW 

I tried

df_tailnum <- df_planes %>%
  if_else(df_planes$tailnum == df_tailnum$TailNum,select(df_planes$year))
  mutate()

which I realise doesn't make sense.

Please help.

Leo
  • 35
  • 5

1 Answers1

0
  1. if_else is a vector-based function, not a verb for working on the whole data at once. At best it should be within a mutate or summarize call, not on its own. (This is breaking partially because of the %>% dumping a data.frame into its first argument.)

  2. (Almost) Never use df_planes$ within a dplyr pipe.

  3. This is an advanced operation known as "merge" or "join". It's a good one to know, but if you're just starting with data.frames and R (and not yet SQL or similar), then it might be a big. See How to join (merge) data frames (inner, outer, left, right) and What's the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN and FULL JOIN? for good discussions on how it works (both in R and in general).

Try these:

dplyr

library(dplyr)
left_join(df_tailnum, df_planes, by = c(TailNum = "tailnum"))
#    TailNum type manufacturer issue_date model status aircraft_type engine_type year
# 1   N657AW <NA>         <NA>       <NA>  <NA>   <NA>          <NA>        <NA>   NA
# 2   N834AW <NA>         <NA>       <NA>  <NA>   <NA>          <NA>        <NA>   NA
### ...

left_join(df_tailnum, select(df_planes, tailnum, year), by = c(TailNum = "tailnum"))
#    TailNum year
# 1   N657AW   NA
# 2   N834AW   NA
# 3   N605AW   NA
### ...

Nothing is found because your sample data has nothing in common between them, but the method is sound.

base R

merge(df_tailnum, df_planes, by.x = "TailNum", by.y = "tailnum", all.x = TRUE)
r2evans
  • 141,215
  • 6
  • 77
  • 149