0

I have 2 dataframes and I am using R:

Van Route Time
U31 23452 8:00
U72 13422 7:00
U98 53622 8:00
U24 16622 8:00
U75 76422 6:00
U45 98422 8:00
U27 46422 9:00
Van Seats
U27 20
U72 22
U24 13
U98 16
U75 22
U45 12
U31 22

the problem here is that both data frames are ordered differently and I would like to add the column of "Seats" to my first DF.

This is the result I want:

Van Route Time Seats
U31 23452 8:00 22
U72 13422 7:00 22
U98 53622 8:00 16
U24 16622 8:00 13
U75 76422 6:00 22
U45 98422 8:00 12
U27 46422 9:00 20

So I would like to basically add this column with the condition that DF1$Vans = DF2$Vans!!!

Thanks in advance :))

3 Answers3

2

Perform a simple merge:

newdf = merge(df1, df2, by="Van")

If df2 includes more columns, subset df2 to the desired column and column used to merge:

newdf = merge(df1, df2[,c("Van","Seats")], by="Van")
larenite
  • 217
  • 1
  • 9
  • How do I specify that I want to add the "Seats" column? Actually the second DF has more columns.. I just want the "Seats" column. Thanks and sorry for the inconvenience. – Mateo Guajardo Feb 03 '22 at 20:18
  • @MateoGuajardo I edited my answer to answer your question – larenite Feb 03 '22 at 20:48
1
# set as data.table
lapply(list(df1, df2), \(i) setDT(i))

# join
df2[df1
    , on=.(Vans)
    , mget(c("i.Van", "x.Seats"))
    ]
# assuming df1 has columns "Van", Route", "Time" as OP has not specified
Sweepy Dodo
  • 1,761
  • 9
  • 15
  • How do I specify that I want to add the "Seats" column? Actually the second DF has more columns.. I just want the "Seats" column. Thanks and sorry for the inconvenience. – Mateo Guajardo Feb 03 '22 at 20:17
  • edited to fetch back only columns `Van` and `Seats`. Edit the `mget` argument as needed. Also try to adhere to [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) guide as it will help us help you :) – Sweepy Dodo Feb 03 '22 at 20:25
  • @larenite Thank you for your submission to improve the answer. Your suggestion of `merge` was not my top choice because of speed for which `data.table` is known for. [Please see](https://stackoverflow.com/questions/4322219/whats-the-fastest-way-to-merge-join-data-frames-in-r) Furthermore, you have provided the `merge` solution. Still, thank you – Sweepy Dodo Feb 03 '22 at 20:59
0

You can do this:

library(dplyr)
left_join(df1, df2[,c("Van","Seats")], by = "Van")

df2[,c("Van","Seats")] subsets df2 on two columns: Van, which is the column df1 and df2have in common and based on which the join can be performed, as well as Seats, the column you want to add to df1

Chris Ruehlemann
  • 20,321
  • 4
  • 12
  • 34