
I have a data.table with start_date and end_date columns. Suppose today is 2010-01-05; I want to filter the table so that only the rows whose date range includes that date are returned. I understand that this can easily be done with a vector scan:

library(data.table)
dt <- data.table(start_date = 20100101:20100120, end_date = 20100105:20100124, value = 1:20)
dt[start_date <= 20100105 & end_date > 20100105, ]

This yields:

dt[start_date <= 20100105 & end_date >20100105, ]
   start_date end_date value
1:   20100102 20100106     2
2:   20100103 20100107     3
3:   20100104 20100108     4
4:   20100105 20100109     5

However, this will be inefficient for very large tables (20 to 50 million rows). I know that, if the table is keyed, I can select one specific pair of dates using data.table's binary search by writing dt[.(20100102, 20100106), ]. But how can I leverage binary search to filter on a range, as in the example above?
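
For reference, here is a minimal sketch of the keyed lookup I mean. It assumes the key is set on (start_date, end_date); that choice of key is my assumption for illustration. It only finds an exact match on both columns, which is why it does not yet solve the range problem:

```r
library(data.table)

dt <- data.table(start_date = 20100101:20100120,
                 end_date   = 20100105:20100124,
                 value      = 1:20)

# Assumption: key on both date columns so that lookups use binary search
setkey(dt, start_date, end_date)

# Exact-match lookup on (start_date, end_date) via binary search;
# this is an equality join, not a range filter
res <- dt[.(20100102L, 20100106L)]
print(res)
```

This returns only the single row whose start_date and end_date equal the two values given, whereas the vector scan above returns every row whose range covers the date.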

Nicolas