I have the two tables below, which consist of a million rows, and I want to see whether the Table 1 target start/end ranges overlap the Table 2 target start/end ranges. How can I achieve this in R?

Table 1: target starts and target ends (numeric)

TargetStart TargetEnd
7756        8357
35598       35009
9954126     9954371
9954126     9954346
9954126     10115435

Table 2: target starts and target ends (numeric)

Target_Start  Target_End        
7000           10000        
23184775       23190099 
900000         1000000      
23157928       23165621 
23157410       23158724     

The desired table would show the overlap status for each row:

Target_Start  Target_End    Overlaps
7000          10000         yes
23184775      23190099      no
90000         1000000       yes
23157928       23165621     no
23157410      23158724      no  

Can anyone help me with this? Thanks.

P.S.: I corrected a typo I made previously.

1 Answer

You can use data.table::inrange like this. Note that you most likely have a typo in Table 1, and that the result you expect in the desired table is incorrect given the actual inputs.

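library(dplyr)  # provides %>% and mutate()

# inrange(x, lower, upper) is TRUE when x lies within any of the [lower, upper] intervals
cbind(tbl1, tbl2) %>%
  mutate(Overlaps = data.table::inrange(TargetStart, Target_Start, Target_End) |
                    data.table::inrange(TargetEnd, Target_Start, Target_End))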

#   TargetStart TargetEnd Target_Start Target_End Overlaps
# 1        7756      8357         7000      10000     TRUE
# 2       35598     35009     23184775   23190099    FALSE
# 3     9954126   9954371        90000     100000    FALSE
# 4     9954126   9954346     23157928   23165621    FALSE
# 5     9954126  10115435     23157410   23158724    FALSE

Data

tbl1 <- read.table(text="TargetStart TargetEnd
7756        8357
35598       35009
9954126     9954371
9954126     9954346
9954126     10115435", header=TRUE)

tbl2 <- read.table(text="Target_Start  Target_End        
7000           10000        
23184775       23190099 
90000          100000       
23157928       23165621 
23157410       23158724", header=TRUE)
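
Checking only whether a Table 1 start or end falls inside a Table 2 range misses the case where a Table 1 range completely contains a Table 2 range. As a rough sketch of an alternative that tests full interval overlap and labels each Table 2 row (as in the desired table), data.table::foverlaps could be used. It assumes the reversed row (35598, 35009) should be read as the range 35009 to 35598, and it uses the 90000 to 100000 values from the Data block above; the helper columns lo and hi and the variable hit are names made up for this example.

```r
library(data.table)

# Same values as the Data block, stored as integers
tbl1 <- data.table(
  TargetStart = c(7756L, 35598L, 9954126L, 9954126L, 9954126L),
  TargetEnd   = c(8357L, 35009L, 9954371L, 9954346L, 10115435L)
)
tbl2 <- data.table(
  Target_Start = c(7000L, 23184775L, 90000L, 23157928L, 23157410L),
  Target_End   = c(10000L, 23190099L, 100000L, 23165621L, 23158724L)
)

# foverlaps() needs start <= end in every row, so normalise Table 1 first
# (assumption: a reversed row describes the same range written backwards)
tbl1[, c("lo", "hi") := list(pmin(TargetStart, TargetEnd),
                             pmax(TargetStart, TargetEnd))]

# The lookup table must be keyed on its interval columns
setkey(tbl1, lo, hi)

# For each Table 2 row, return the index of the first overlapping Table 1 row,
# or NA when no Table 1 range overlaps it
hit <- foverlaps(tbl2, tbl1,
                 by.x = c("Target_Start", "Target_End"),
                 type = "any", mult = "first", which = TRUE)

tbl2[, Overlaps := !is.na(hit)]
print(tbl2)
```

Because foverlaps performs a keyed overlap join rather than comparing every pair of rows, this approach should scale to tables with a million rows.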
CPak