I have two data.table
and I am trying to merge them together.
The first one has the columns: code
offers
The second one has the columns: code
offers
nb_offers
I am trying to add the third column to the first data.table through a merge.
The minimal files are available here:
- dt1 = https://ufile.io/vjeju
- dt2 = https://ufile.io/l61jc
My issue is that for some rows, the values aren't filled while they're obviously here.
A working example:
> library(data.table)
> dt1 <- data.table(readRDS("dt1.rds"))
> dt2 <- data.table(readRDS("dt2.rds"))
> dt1[offers == "0.05"]
code offers
1: 24428 0.05
> dt2[offers == "0.05"]
code offers nb_offers
1: 24428 0.05 5
> merge(dt1[offers == "0.05"], dt2[offers == "0.05"],
by = c("code", "offers"))
code offers nb_offers
1: 24428 0.05 5
Let's change things very slightly:
> dt1[offers == "0.06"]
code offers
1: 24428 0.06
> dt2[offers == "0.06"]
code offers nb_offers
1: 24428 0.06 3
> merge(dt1[offers == "0.06"], dt2[offers == "0.06"],
by = c("code", "offers"))
Empty data.table (0 rows) of 3 cols: code,offers,nb_offers
This makes absolutely no sense to me.
I tried to replicate the error, but if I create the data.tables from scratch, the error doesn't occurr:
> tmp1 <- data.table(code = "24428", offers = 0.06)
> tmp2 <- data.table(code = "24428", offers = 0.06, nb_offers = 3)
> setkey(tmp2, code)
> merge(tmp1, tmp2, by = c("code", "offers"))
code offers nb_offers
1: 24428 0.06 3
> all.equal(tmp1, dt1[offers == "0.06"])
[1] TRUE
> all.equal(tmp2, dt2[offers == "0.06"])
[1] TRUE
I understand I can do what I want to do in other ways, but I'd like to understand what's going on here.
Thank you.