R merge behaviour

Question

I came across a merge behaviour that I do not understand.

Let's say that you want to merge a numer of data.frames by a shared column, keep the other columns and rename them using suffixes:

dt1 <- data.frame(ind = 1:10, var1 = 2:11, var2 = 3:12)
dt2 <- data.frame(ind = 1:10, var1 = 5:14, var2 = 6:15)
dt3 <- data.frame(ind = 1:10, var1 = 7:16, var2 = 8:17)
dt4 <- data.frame(ind = 1:10, var1 = 10:19, var2 = 11:20)

dt_all <- dt1 %>%
    merge(dt2, by = "ind", all = TRUE, suffixes = c(".1", ".2")) %>%
    merge(dt3, by = "ind", all = TRUE, suffixes = c("", ".3")) %>%
    merge(dt4, by = "ind", all = TRUE, suffixes = c("", ".4"))

This is what I get:

> head(dt_all, 1)
  ind var1.1 var2.1 var1.2 var2.2 var1 var2 var1.4 var2.4 
1   1      2      3      5      6    7    8     10     11

I expected to get var1.3 var2.3 instead of var1 var2. The result does not change if I do not use the dplyr pipes or use data.table with or without seting keys for all the tables. Can anyone explain this?

The column names should be the same for the 'x' and 'y' dataset for the `suffixes` to work. In the first `merge` it worked because both of them have the 'var1', 'var2' column. It is not the case after after the merge, now it is var1.1, var1.2 etc which is getting a nonmatch with var1 var2 from dt3 — akrun, Nov 23 '17 at 09:11
Ah, ok, I get it. Is there a way to force the `suffixes` to work? — MRau, Nov 23 '17 at 09:17

R merge behaviour

0 Answers0