I just discovered this bug, only to find that some people are calling it a "feature". It makes rbindlist behave differently from do.call("rbind", l), because rbind DOES respect column names. Further, this entirely unexpected behavior is not mentioned anywhere in the documentation. Is this really intentional?
Code example:
> library(data.table)
> DT1 <- data.table(a=1, b=2)
> DT2 <- data.table(b=3, a=4)
> DT1
a b
1: 1 2
> DT2
b a
1: 3 4
I would expect rbind'ing these to produce columns with a = 1,4 and b = 2,3. That is what I get with both rbind.data.table and rbind.data.frame, though rbind.data.table produces a warning.
> rbind(DT1, DT2)
a b
1: 1 2
2: 4 3
Warning message:
In data.table::.rbind.data.table(...) :
Argument 2 has names in a different order. Columns will be bound by name for consistency with base. You can drop names (by using an unnamed list) and the columns will then be joined by position, or set use.names=FALSE. Alternatively, explicitly setting use.names to TRUE will remove this warning.
> rbind(as.data.frame(DT1), as.data.frame(DT2))
a b
1 1 2
2 4 3
> do.call('rbind', list(DT1, DT2))
a b
1: 1 2
2: 4 3
Warning message:
In data.table::.rbind.data.table(...) :
Argument 2 has names in a different order. Columns will be bound by name for consistency with base. You can drop names (by using an unnamed list) and the columns will then be joined by position, or set use.names=FALSE. Alternatively, explicitly setting use.names to TRUE will remove this warning.
rbindlist, however, is happy to silently corrupt the data:
> rbindlist(list(DT1, DT2))
a b
1: 1 2
2: 3 4
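
For anyone hitting the same thing, here is a minimal sketch of two ways to get the by-name result shown above. The first assumes a data.table version in which rbindlist() accepts a use.names argument (older releases may not have it). The second does not depend on that: it reorders each table to the first table's column order before binding by position. The names tables, aligned, by_name and by_position are only illustrative.

```r
library(data.table)

DT1 <- data.table(a = 1, b = 2)
DT2 <- data.table(b = 3, a = 4)

# Option 1: ask rbindlist() to match columns by name.
# Assumption: the installed data.table supports use.names in rbindlist().
by_name <- rbindlist(list(DT1, DT2), use.names = TRUE)

# Option 2: put every table into the first table's column order,
# then bind by position explicitly.
tables  <- list(DT1, DT2)
aligned <- lapply(tables, function(dt) dt[, names(tables[[1]]), with = FALSE])
by_position <- rbindlist(aligned, use.names = FALSE)

print(by_name)
print(by_position)
```

Both should give a = 1,4 and b = 2,3, matching what rbind returns. Option 2 assumes every table has exactly the same set of column names; if they differ, the column selection will fail with an error rather than silently misalign anything.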