0

Join two tables such that we get all rows from tableA and all columns of matching rows from tableB

I want to use data.table and not data.Frame. Please suggest the fastest method.

tableA <- data.table(column1 = c( 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9))

tableB <- data.table(column1 = c( 1.0, 1.2, 1.5, 1.9), column2 = c( "A", "B", "C", "D"), column3 = c( "AA", "BB", "CC", "DD"))

expected result:

    column1 column2 column3
 1:     1.0       A      AA
 2:     1.1    <NA>    <NA>
 3:     1.2       B      BB
 4:     1.3    <NA>    <NA>
 5:     1.4    <NA>    <NA>
 6:     1.5       C      CC
 7:     1.6    <NA>    <NA>
 8:     1.7    <NA>    <NA>
 9:     1.8    <NA>    <NA>
10:     1.9       D      DD
PavoDive
  • 6,322
  • 2
  • 29
  • 55
VjSwamy
  • 79
  • 4
  • 2
    it will be good if you can provide input data using dput – Hunaidkhan Jun 25 '19 at 04:41
  • am so new to R, pls explain what is dput – VjSwamy Jun 25 '19 at 04:54
  • 1
    Welcome to SO, please see how to ask a good question, there are plenty of posts, the most famous https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example . Get familiarized with R code overall, I had to do some heavy editing to bring your question as readable. – PavoDive Jun 25 '19 at 04:57

2 Answers2

1

You only need to invert the order of the join:

tableB[tableA, on = "column1"]
PavoDive
  • 6,322
  • 2
  • 29
  • 55
1

When working with larger datasets, it's probabaly better to perform an update join, to avoind the creation of new objects...

tableA[tableB, `:=`( column2 = i.column2, column3 = i.column3), on = .(column1)][]

will add column2 and column3 from tableB to tableA, based on the values in column1 of both tables.

Wimpel
  • 26,031
  • 1
  • 20
  • 37
  • 1
    Alternatively: `tableA[, c("column2", "column3") := tableB[.SD, on = "column1", .(column2, column3)]]` – s_baldur Jun 25 '19 at 09:00