I am working on 4C data. I have a .txt file that contains the columns chromosome, start, end, nReads, RPMs, p.value and q.value. I am only interested in significant interactions on chr15, and I later want to filter the interactions that are farther than 100 kb and nearer than 3 kb.
library(r3Cseq)
library(BSgenome.Hsapiens.UCSC.hg19.masked)
library(GenomicRanges)
library(Homo.sapiens)
kura.int <- read.table("KURA_DpnII.interaction.txt", header = T)
kura_data <- kura.int[kura.int$chromosome == "chr15" & kura.int$q.value < 0.1, ]
kura.int.gr <- makeGRangesFromDataFrame(kura_data, keep.extra.columns = T)
id <- "91433"
rccdGene <- genes(TxDb.Hsapiens.UCSC.hg19.knownGene,
                  filter = list(gene_id = id))
rccdPromoter <- start(rccdGene)
kura_end <- ((rccdPromoter+kura_data$end)/2)
kura <- cbind(rccdPromoter, kura_end)
kura_2 <- cbind(kura, kura_data$chromosome)
colnames(kura_2) <- c("start", "end", "chr")
kura_3 <- kura_2[distance(kura_2$start, kura_2$end)<=100000]
In the "kura_2" matrix I have 3 columns, namely "chr", "start" and "end", where the new start is the promoter of the gene and the ends differ. I wrote the block of code above, but at the filtering step, where I use the function "distance", I get this error:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘distance’ for signature ‘"character", "character"’
This is the kura_2 matrix with its 3 columns, "start", "end" and "chr":
start end chr
1 91498106 86026693 chr15
2 91498106 91466684 chr15
3 91498106 88330238 chr15
4 91498106 91488399.5 chr15
5 91498106 91491012.5 chr15
6 91498106 91768848 chr15
Now, how do I filter the genomic interactions where the distance between start and end is more than 100 kb, and those where it is less than 3 kb?
The new start is the promoter of the gene and the new end is ((start+end)/2), which is why I have float values; this makes it easy to plot interactions from my promoter (bait). Is there a better way to filter out the interactions? Thank you in advance.
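For reference, here is a minimal sketch of one way the filtering could be done in base R. It rests on some assumptions: cbind() with the chromosome column probably turns every column into character, which would be consistent with the "character", "character" signature in the error, so the sketch keeps the coordinates numeric in a data frame. The names kura_df, min_dist and max_dist are hypothetical, the values are copied from the table above, and the thresholds assume the goal is to keep interactions between 3 kb and 100 kb from the bait.

```r
# Hypothetical example data mirroring the kura_2 table above.
# A data.frame keeps start/end numeric, unlike cbind() with a character column.
kura_df <- data.frame(
  chr   = "chr15",
  start = 91498106,                       # bait promoter position
  end   = c(86026693, 91466684, 88330238,
            91488399.5, 91491012.5, 91768848),
  stringsAsFactors = FALSE
)

# Absolute distance between the bait and each interaction end.
kura_df$dist <- abs(kura_df$end - kura_df$start)

# Assumed thresholds: keep interactions at least 3 kb and at most 100 kb away.
min_dist <- 3000
max_dist <- 100000
kura_filtered <- kura_df[kura_df$dist >= min_dist & kura_df$dist <= max_dist, ]

print(kura_filtered)
```

With these six rows, only rows 2, 4 and 5 fall inside the assumed window. To drop that window instead of keeping it, negate the condition. Note that distance() from GenomicRanges is defined for GRanges objects, not for plain numeric or character vectors, so the arithmetic above avoids it entirely.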