
I can successfully use foverlaps with a small sample of my dataset, but when I use the full data (data.tables with over 30k rows), it fails with the following error:

Error message:

Error in if (any(x[[xintervals[2L]]] - x[[xintervals[1L]]] < 0L)) stop("All entries in column ",  :
  missing value where TRUE/FALSE needed

My reading of the error message is that there are no overlaps between the two data.tables.

Q1: Am I interpreting the message correctly?

Q2: Any idea why this might happen with the larger dataset? Could it be due to the size of the dataset?

I do have a lot of unique values, which according to the foverlaps help file can be expected to slow things down proportionally, but not before the data gets into millions of rows, which is far from the case here. Thank you.

David Arenburg
jpinelo
  • This often indicates an `NA` value being fed to the `any` function, so it returns `NA` and that's not a legal logical value. – Carl Witthoft May 07 '15 at 13:50
  • Removing NAs fixed the error. – jpinelo Jul 24 '15 at 11:17
  • @jpinelo could you check the provided answer and give feedback if it does not solve your problem? If it does, please accept it to mark the question as answered. Thanks – jangorecki Jun 13 '20 at 19:23
  • @jpinelo if that was your upvote on my answer a moment ago, please note that accepting an answer means clicking the tick; an upvote won't mark the question as resolved. Thank you – jangorecki Jun 14 '20 at 13:29

1 Answer


There is no reproducible example, so it is not possible to investigate your issue directly.
As Carl noted in a comment, the error is most likely caused by NA values in the input.
In the recent development version, Arun has made some improvements to foverlaps. One of them is a clearer error message when NA values are detected.

install.packages("data.table")

This feature is already on CRAN as of 1.12.2.
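
Since the question has no reproducible example, the following is only a minimal sketch of the likely cause and a workaround. The tables `x` and `y` and the column names `start` and `end` are made up for illustration and are not taken from the question. It shows how a single NA in an interval column turns the check from the error message into `NA`, and how dropping those rows before calling foverlaps avoids the problem.

```r
library(data.table)

# Hypothetical data: `start`/`end` are assumed names for the interval columns.
x <- data.table(id    = 1:4,
                start = c(1L, 5L, NA, 20L),
                end   = c(3L, 8L, 15L, 25L))
y <- data.table(start = c(2L, 21L),
                end   = c(6L, 22L),
                label = c("a", "b"))
setkey(y, start, end)  # foverlaps requires y to be keyed on its interval columns

# Same kind of check as in the error message: one NA makes the comparison NA,
# any() then returns NA, and if (NA) fails with
# "missing value where TRUE/FALSE needed".
chk <- any(x$end - x$start < 0L)
is.na(chk)

# Inspect the offending rows, then keep only rows with complete intervals.
bad     <- x[is.na(start) | is.na(end)]
x_clean <- x[!is.na(start) & !is.na(end)]

# With the NA rows removed, the overlap join runs.
res <- foverlaps(x_clean, y, by.x = c("start", "end"),
                 type = "any", nomatch = 0L)
```

Looking at `bad` before dropping rows is worthwhile, because missing interval bounds in a large dataset often point to an upstream parsing or merge problem rather than to rows that are safe to discard.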

jangorecki