0

i have a data.table :

  data.table(id = c(rep(1,5)),
               year = c(rep(2015,3), rep(2016,2)), 
               class = c(rep("A", 3), rep("B", 2)),
               origin = c("Europe", "Asia", "Africa", "Europe", "Asia"), 
               count = c(30299, 3, 34, 2, 800))

   id year class origin count
1:  1 2015     A Europe 30299
2:  1 2015     A   Asia     3
3:  1 2015     A Africa    34
4:  1 2016     B Europe     2
5:  1 2016     B   Asia   800

but some of the information is not correct. I want to overwrite the region using another data.table:

  data.table(id = c(1),
             year = c(2015,2016), 
             class = c("A", "B"),
             origin = c("Europe", "Asia"))

   id year class origin
1:  1 2015     A Europe
2:  1 2016     B   Asia

so that the original table becomes :

  id year class origin count
1:  1 2015     A Europe 30299
2:  1 2015     A Europe     3
3:  1 2015     A Europe    34
4:  1 2016     B   Asia     2
5:  1 2016     B   Asia   800

i ried doing this with merge and also within, but i am getting errors.

Nneka
  • 1,764
  • 2
  • 15
  • 39

1 Answers1

1

I named your first data.table dt_A, the second one dt_B and assume, you identify the entries in dt_B by id, year and class.

dt_A[dt_B, on = c("id", "year", "class"), origin := i.origin]

returns

   id year class origin count
1:  1 2015     A Europe 30299
2:  1 2015     A Europe     3
3:  1 2015     A Europe    34
4:  1 2016     B   Asia     2
5:  1 2016     B   Asia   800

Basically we did a left join to get the needed data and updated the origin column.

Martin Gal
  • 16,640
  • 5
  • 21
  • 39
  • 1
    or `dt_A[, origin := dt_B[.SD, on = c("id", "year", "class"), origin]]` as recommended here: https://stackoverflow.com/a/54313203/4552295 – s_baldur Jun 15 '21 at 10:34