This is an R question.

I have two data frames, "y" and "l":

> head(y)
    SNP Category
1 29351  exclude
2 29357  exclude
3 29360  exclude
4 29372  include
5 29426  include
6 29432  include

> head(l)
  start  stop
1   246 11012
2 11494 13979
3 14309 18422
4 20728 20995
5 21457 29345
6 30035 31693

For each row of y that has the value "include" in the second column, I want to check whether the value in its first column (SNP) lies on or between the "start" and "stop" values of any row in l. If it does, I want to replace "include" with "exclude" in that row of y. I guess I could do it with nested for loops, but I would like to know a more elegant and faster way. y and l have different numbers of rows. Thank you.

user3479780
  • I might consider doing a merge like those suggested [here](http://stackoverflow.com/questions/24480031/roll-join-with-start-end-window) – MrFlick Nov 13 '14 at 03:50

1 Answer

This worked, but was slow.

y <- read.csv(file="SNP_pos_categorised0.99cutoff.csv", header=TRUE)
l <- read.csv("SNPsToMoveFromINCLUDEtoEXCLUDE.csv", header=TRUE)

colnames(y)
#[1] "SNP"      "Category"

levels(y$Category)
#[1] " exclude" " include"

colnames(l)
#[1] "start" "stop"

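# Start processing: for each " include" row of y, scan the ranges in l
for (i in seq_len(nrow(y)))
{
    if (y[i, "Category"] == " include")
    {
        for (j in seq_len(nrow(l)))
        {
            # SNP lies on or between start and stop of range j
            if (y[i, "SNP"] >= l[j, "start"] && y[i, "SNP"] <= l[j, "stop"])
            {
                y[i, "Category"] <- " exclude"
                break  # one matching range is enough, skip the rest
            }
        }
    }
}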
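
A possibly faster alternative is to drop the explicit loops and test each SNP against all ranges at once. The following is only a sketch in base R. The toy data is made up to mirror the structure shown in the question, and it assumes the Category values have no leading space (unlike the levels printed above) and are stored as character strings rather than a factor.

```r
# Toy data mirroring the question's structure (values are illustrative only)
y <- data.frame(
  SNP = c(29351, 29357, 29360, 29372, 29426, 30035, 31000, 32000),
  Category = c("exclude", "exclude", "exclude", "include",
               "include", "include", "include", "include"),
  stringsAsFactors = FALSE
)
l <- data.frame(
  start = c(246, 11494, 14309, 20728, 21457, 30035),
  stop  = c(11012, 13979, 18422, 20995, 29345, 31693)
)

# For each SNP, TRUE if it lies on or between any start/stop pair in l
in_range <- vapply(
  y$SNP,
  function(snp) any(snp >= l$start & snp <= l$stop),
  logical(1)
)

# Flip only the rows that are currently "include" and fall inside a range
y$Category[y$Category == "include" & in_range] <- "exclude"
```

This still compares every SNP with every range, but the inner comparison is vectorised, which is usually much quicker than indexing a data frame cell by cell. If the real file has the leading spaces shown in the levels above, reading it with `strip.white=TRUE` or comparing against " include" should make the same idea work.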
user3479780