I am loading some JSON data using jsonlite, which results in nested data similar in structure to the toy data.table dt constructed below. I want to be able to use rbindlist to bind the nested data.tables together.
Setup:
> dt <- data.table(a=c("abc", "def", "ghi"), b=runif(3))
> dt[, c:=list(list(data.table(d=runif(4), e=runif(4))))]
> dt
a b c
1: abc 0.2623218 <data.table>
2: def 0.7092507 <data.table>
3: ghi 0.2795103 <data.table>
Using the non-standard evaluation (NSE) built into data.table, I can do:
> rbindlist(dt[, c])
d e
1: 0.8420476 0.26878325
2: 0.1704087 0.59654706
3: 0.6023655 0.42590380
4: 0.9528841 0.06121386
5: 0.8420476 0.26878325
6: 0.1704087 0.59654706
7: 0.6023655 0.42590380
8: 0.9528841 0.06121386
9: 0.8420476 0.26878325
10: 0.1704087 0.59654706
11: 0.6023655 0.42590380
12: 0.9528841 0.06121386
which is exactly what I expect and want. Furthermore, the original dt remains unmodified:
> dt
a b c
1: abc 0.2623218 <data.table>
2: def 0.7092507 <data.table>
3: ghi 0.2795103 <data.table>
However, when manipulating the data.table within a function, I generally want to use get with string column names:
> rbindlist(dt[, get("c")])
V1 V2
1: 0.8420476 0.26878325
2: 0.1704087 0.59654706
3: 0.6023655 0.42590380
4: 0.9528841 0.06121386
5: 0.8420476 0.26878325
6: 0.1704087 0.59654706
7: 0.6023655 0.42590380
8: 0.9528841 0.06121386
9: 0.8420476 0.26878325
10: 0.1704087 0.59654706
11: 0.6023655 0.42590380
12: 0.9528841 0.06121386
Now the column names have been lost and replaced by the default "V1" and "V2". Is there a way to retain the names?
In the development version (v1.9.5), the problem is worse than lost names. After executing rbindlist(dt[, get("c")]), the entire data.table becomes corrupt:
> dt
Error in FUN(X[[3L]], ...) :
Invalid column: it has dimensions. Can't format it. If it's the result of data.table(table()), use as.data.table(table()) instead.
To be clear, the lost-names issue happens in both v1.9.4 (installed from CRAN) and v1.9.5 (installed from GitHub), but the corrupt data.table issue seems to affect v1.9.5 only (as of today, July 8, 2015).
If I stick with the NSE version, everything runs smoothly. My issue is that doing so would mean writing multiple NSE functions that call each other, which gets messy pretty fast.
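For concreteness, here is a minimal sketch of the string-based style I would prefer to write. The helper name bind_nested and its arguments x and col are hypothetical, made up purely for illustration; the body is just the get-based call from above wrapped in a function.

```r
library(data.table)

# Toy data with a list column of nested data.tables, as in the setup above.
dt <- data.table(a = c("abc", "def", "ghi"), b = runif(3))
dt[, c := list(list(data.table(d = runif(4), e = runif(4))))]

# Hypothetical helper (name and signature are illustrative only):
# takes the name of the nested column as a string and binds its elements.
bind_nested <- function(x, col) {
  # get(col) looks the column up by its string name inside j
  rbindlist(x[, get(col)])
}

res <- bind_nested(dt, "c")
```

This is the call pattern I would like to keep, since the column name can be passed around as an ordinary string between functions. It relies on the same get-based lookup shown earlier, so I would expect it to show the same name problem.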
Are there any (non-NSE-based) known work-arounds? Also, is this a known issue?