This is probably simple but I don't know where the problem is.
I have two R data frames that I am trying to left-merge using data.table. Claims data has 100,000 unique IDs and ID data has 60,000 unique IDs.
Claims[ID, on = "id"]
However, after the merge I get 100,000 unique IDs. Isn't this the syntax for a left merge?
When I try ID[Claims, on = "id"], I get 60,000 unique IDs, but that is the code for a right merge.
I am a beginner with R, so this question may seem very basic. What is the correct code for a left join that leaves me with the 60k unique IDs from my 'ID' data?
Sample code:
library(data.table)

id <- data.table(
  Id = c("A", "B", "C", "C"),
  X1 = c(1L, 3L, 5L, 7L),
  XY = c("x2", "x4", "x6", "x8")
)
claims <- data.table(
  Id = c("A", "B", "B", "D", "E"),
  Y1 = c(1L, 3L, 5L, 7L, 8L),
  XY = c("y1", "y3", "y5", "y7", "y9")
)
m <- claims[id, on = "Id"]
# Column names are case-sensitive: the column is "Id", so m$id is NULL and its count is 0
length(unique(m$Id))       # gives 3
length(unique(claims$Id))  # gives 4
length(unique(id$Id))      # gives 3
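
To show the result I am after, here is a minimal sketch written with merge() and all.x = TRUE instead of the X[Y] syntax. It rebuilds the toy tables from the sample code so it runs on its own. I am assuming that "left join" here means keeping every row of id and attaching the matching rows from claims; the name left is just a placeholder.

```r
library(data.table)

# Toy tables, same values as in the sample code
id <- data.table(
  Id = c("A", "B", "C", "C"),
  X1 = c(1L, 3L, 5L, 7L),
  XY = c("x2", "x4", "x6", "x8")
)
claims <- data.table(
  Id = c("A", "B", "B", "D", "E"),
  Y1 = c(1L, 3L, 5L, 7L, 8L),
  XY = c("y1", "y3", "y5", "y7", "y9")
)

# Keep every row of id (all.x = TRUE); unmatched Ids get NA in the claims columns.
# Both tables have an XY column, so merge() suffixes them as XY.x and XY.y.
left <- merge(id, claims, by = "Id", all.x = TRUE)

print(left)
uniqueN(left$Id)  # number of distinct Ids kept from id
```

With this, the Ids that exist only in claims ("D" and "E") are dropped, and "C" is kept even though it has no claims.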