I have many large data frames. For a pair of data frames, I want the rows of a whose first column (chr) equals that of some row in b and whose second column (pos) differs from that row's pos by no more than 5000. For example:
>a
chr pos
chr2 10000
chr2 20000
chr2 45000
chr2 60000
chr2 80000
chr2 100000
>b
chr pos
chr2 10000
chr2 30000
chr2 40000
chr2 55000
chr2 80000
My expected result:
>c
chr pos
chr2 10000
chr2 45000
chr2 60000
chr2 80000
I tried this:
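# Start with a placeholder row so rbind() has something to append to
c <- data.frame(chr = 0, pos = 0)
for (i in 1:nrow(b)) {
  # Rows of a on the same chr as b[i, ] and within 5000 of its pos
  c1 <- a[(a$chr %in% b[i, 1]) & abs(a$pos - b[i, 2]) <= 5000, ]
  c <- rbind(c, c1)
}
# Drop the placeholder row
c <- c[-1, ]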
But it is too slow and inefficient, since it scans all of a and grows c with rbind for every row of b. I hope there is a better way. Thanks in advance!
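
For reference, one direction that might avoid the row-by-row loop is a sorted lookup per chromosome in base R. This is only a minimal sketch under some assumptions: the function name near_match and its tol argument are made up for illustration, the columns are assumed to be named chr and pos, and pos is assumed to be numeric. It also differs from the loop above in that each matching row of a is returned once and in a's original order, even if several rows of b fall within range.

```r
# Hypothetical helper (not from any package): keep the rows of `a` that have
# at least one row in `b` with the same chr and a pos within `tol`.
near_match <- function(a, b, tol = 5000) {
  keep <- logical(nrow(a))
  # Positions of b, grouped by chromosome
  b_by_chr <- split(b$pos, as.character(b$chr))
  for (ch in intersect(unique(as.character(a$chr)), names(b_by_chr))) {
    idx <- which(a$chr == ch)
    bp <- sort(b_by_chr[[ch]])
    x <- a$pos[idx]
    # Index of the largest b position <= x (0 if there is none)
    lo <- findInterval(x, bp)
    # Nearest b position on each side, or +/-Inf when there is none
    left <- ifelse(lo >= 1, bp[pmax(lo, 1)], -Inf)
    right <- ifelse(lo < length(bp), bp[pmin(lo + 1, length(bp))], Inf)
    # A row is kept if its nearest neighbour in b is within tol
    keep[idx] <- pmin(x - left, right - x) <= tol
  }
  a[keep, , drop = FALSE]
}

# The example data from above
a <- data.frame(chr = "chr2", pos = c(10000, 20000, 45000, 60000, 80000, 100000))
b <- data.frame(chr = "chr2", pos = c(10000, 30000, 40000, 55000, 80000))
res <- near_match(a, b)
res$pos
# 10000 45000 60000 80000
```

The idea is that, once b's positions are sorted within a chromosome, only the nearest position on each side of a given a position needs checking, so findInterval can replace the comparison against every row of b.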