0

I need to do the following process using R: 1)I have a df1 that looks

Chrom Start End
1 0 30
1 50 80
1 100 140
2 20 40
2 60 100
2 110 130

2)And a df2 that looks

Chrom TAG Position
1 AX-1020 55
2 AX-1320 75

3)So, I would like to add info from df2 into df1 using the interval between Start and End in df1 and Position in df2 as I show here

Chrom Start End TAG Position
1 0 30 NA NA
1 50 80 AX-1020 55
1 100 140 NA NA
2 20 40 NA NA
2 60 100 AX-1320 75
2 110 130 NA NA

I hope anyone can help me. I'd appreaciate a lot!

akrun
  • 874,273
  • 37
  • 540
  • 662
Nicolas
  • 3
  • 1
  • Nice post on this topic: [Overlap join with start and end positions](https://stackoverflow.com/questions/24480031/overlap-join-with-start-and-end-positions) – Henrik Mar 01 '21 at 20:28

1 Answers1

1

Also data.table

library( data.table )
#read sample data as data.tables
df1 <- fread("Chrom     Start   End
1   0   30
1   50  80
1   100     140
2   20  40
2   60  100
2   110     130")
df2 <- fread("Chrom     TAG     Position
1   AX-1020     55
2   AX-1320     75")

#update non-equi join
df1[ df2, `:=`(TAG = i.TAG, 
               Position = i.Position ), 
     on = .(Chrom, Start <= Position, End >= Position) ][]
#    Chrom Start End     TAG Position
# 1:     1     0  30    <NA>       NA
# 2:     1    50  80 AX-1020       55
# 3:     1   100 140    <NA>       NA
# 4:     2    20  40    <NA>       NA
# 5:     2    60 100 AX-1320       75
# 6:     2   110 130    <NA>       NA
Wimpel
  • 26,031
  • 1
  • 20
  • 37
  • 2
    Perhaps I'm mistaken, but you seem to make a left join, while I perform a join by reference. – Wimpel Mar 01 '21 at 19:39