for loops in R are extremely slow, but I don't know of any alternative way to achieve the following.
As shown in this screenshot:
What I want the output format to look like:
> gene_id tss_id x y
where, for each row of isosub, the denominator comes from the iso.agg row with the corresponding gene_id:
x = isosub$q1_FPKM / iso.agg$q1_FPKM
y = isosub$q2_FPKM / iso.agg$q2_FPKM
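To make the goal concrete, here is a small made-up example of the input and the output I am after. The values are invented, and I am assuming that iso.agg has one row per gene_id (in the same order the genes first appear in isosub) holding that gene's total FPKM; the real tables are much larger.

```r
# Hypothetical toy data standing in for the real tables (values are invented)
isosub <- data.frame(
  gene_id = c("g1", "g1", "g2"),
  tss_id  = c("t1", "t2", "t3"),
  q1_FPKM = c(2, 6, 5),
  q2_FPKM = c(1, 3, 4)
)

# Assumption: one row per gene, holding the per-gene totals
iso.agg <- data.frame(
  gene_id = c("g1", "g2"),
  q1_FPKM = c(8, 5),
  q2_FPKM = c(4, 4)
)

# Desired output, worked out by hand: each isoform value divided by
# the total of the gene it belongs to
expected <- data.frame(
  gene_id = c("g1", "g1", "g2"),
  tss_id  = c("t1", "t2", "t3"),
  x       = c(2 / 8, 6 / 8, 5 / 5),
  y       = c(1 / 4, 3 / 4, 4 / 4)
)
print(expected)
```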
Here is my code with the for loop:
    length = length(isosub$gene_id)
    gene_id = isosub$gene_id
    tmpq1 = isosub$q1_FPKM
    tmpq2 = isosub$q2_FPKM
    isoq1 = iso.agg$q1_FPKM
    isoq2 = iso.agg$q2_FPKM
    # j points at the iso.agg row of the gene currently being processed
    j = 1
    denominator_q1 = isoq1[j]
    denominator_q2 = isoq2[j]
    # preallocate the result vectors
    o2_q1 = rep(0, length)
    o2_q2 = rep(0, length)
    for (i in 1:length) {
        o2_q1[i] = tmpq1[i] / denominator_q1
        o2_q2[i] = tmpq2[i] / denominator_q2
        # when the next row belongs to a different gene, advance to the
        # next iso.agg row (the i < length check avoids reading past the end)
        if (i < length && gene_id[i + 1] != gene_id[i]) {
            j = j + 1
            denominator_q1 = isoq1[j]
            denominator_q2 = isoq2[j]
        }
    }
When length = 1000, system.time shows:
> user system elapsed
> 55.74 0.00 56.45
And my actual length is even larger: 13751.
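For reference, this is roughly how a timing like the one above can be taken. It is only a sketch on invented toy data: loop_ratio is a hypothetical wrapper name for the loop logic, and agg_q1 is assumed to hold the per-gene totals in the order the genes first appear. On data this small the reported times will of course be close to zero.

```r
# Hypothetical toy data: 3 isoform rows belonging to 2 genes
gene_id <- c("g1", "g1", "g2")
q1      <- c(2, 6, 5)
agg_q1  <- c(8, 5)  # assumed per-gene totals, in order of first appearance

# loop_ratio is a hypothetical helper wrapping the loop logic
loop_ratio <- function(gene_id, num, denom) {
  n   <- length(gene_id)
  out <- numeric(n)
  j   <- 1
  for (i in seq_len(n)) {
    out[i] <- num[i] / denom[j]
    # advance to the next gene's total when the gene_id changes
    if (i < n && gene_id[i + 1] != gene_id[i]) j <- j + 1
  }
  out
}

# system.time() evaluates the expression and reports user, system
# and elapsed time
timing <- system.time(x <- loop_ratio(gene_id, q1, agg_q1))
print(timing)
```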