2

Can a non-equi join be done using the "on" argument in data.table when one or more column names contain spaces?

I understand that syntactically invalid names should be avoided and can be corrected (e.g., How to deal with spaces in column names?) but beyond that, wrapping in backticks doesn't seem to work in data.table on argument.

For example, to perform a non-equi join of the two tables below:

library(data.table) 

dt1 <- data.table(x = c("a", "b", "c"), y = c(2, 1, 3))

dt2 <- data.table(z = c("r", "g", "b"),
                  "y min" = seq(0.9, 2.9, 1), 
                  "y max" = seq(1.1, 3.1, 1))

#join dt1 and dt2 where y is in the range ('y min', 'y max')
dt1[dt2, on = .(y >= `y min`, y <= `y max`), z := i.z]

Fails with

##Error in `[.data.table`(dt1, dt2, on = .(y >= `y min`, y <= `y max`),  : 
##  Column(s) [`y min`,`y max`] not found in i

Desired output (made by same join with syntactically valid names):

dt1[dt2, on = .(y >= ymin, y <= ymax), z := i.z]
##> dt1
##   x y z
##1: a 2 g
##2: b 1 r
##3: c 3 b
Chris Holbrook
  • 2,531
  • 1
  • 17
  • 30
  • 3
    Try `dt1[dt2, on = c("y>=y min", "y<=y max"), z := i.z]` – David Arenburg Sep 17 '18 at 21:02
  • 1
    That works, thanks. I had tried this construct, but included white space around the operators (`c("y> = y min", "y <= y max")`). Frank's linked answer was helpful in seeing that point, which I haven't run across in the data.table documentation. – Chris Holbrook Sep 18 '18 at 00:21

0 Answers0