
I have a list of gene IDs, e.g. (1024, 284, 526). Their corresponding sequences are in a file (read in as `ff`), and I extracted `names(ff)` into `desc`. The file can contain the same gene ID more than once, for example (1024, 1024, 1024, 284, 526). I want to extract every sequence that matches each gene ID, so all three sequences with ID 1024 in this example, but my loop extracts only the first hit (one sequence per ID):

for(i in 1:length(geneids)){
   # match() returns only the position of the first hit, so at most one sequence per id
   temp <- ff[match(geneids[i], as.numeric(gsub("\\_.*","", desc)))]
   subset.seq <- c(subset.seq, temp)
   subset.seq <- subset.seq[!sapply(subset.seq, is.null)]
   #print(subset.seq)
   temp1 <- names(temp)
   temp2 <- c(sapply(temp, function(x) x))
   s2 <- c(s2, temp1)
   s1 <- c(s1, temp2)
   i <- i + 1
}
user2498657
  • This makes little to no sense at all as someone not sitting in front of your data. I can't even begin to figure out what is going on here. Please try to reduce your question to a simple, minimal and reproducible example with code and an attempted solution. See [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – thelatemail Sep 12 '13 at 06:14
  • The resetting of the index `i` is not needed. You should describe what you are doing and at the very least offer `str()` output from `geneids` and `ff`. Even better would be `dput(head(ff))` and `dput(head(geneids))`. – IRTFM Sep 12 '13 at 06:18
  • I have edited my question; I hope it makes more sense now. Sorry for not explaining it properly. – user2498657 Sep 12 '13 at 06:41
  • I really need to solve this problem. Please help. – user2498657 Sep 12 '13 at 17:36

1 Answer


Removing `match` resolved the problem. `match` returns only the position of the first hit, whereas comparing with `==` gives a logical vector that selects every matching sequence.

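ids <- as.numeric(gsub("\\_.*", "", desc))  # numeric gene id of each sequence in ff
temp <- ff[0]                               # empty container of the same type as ff

for (i in seq_along(geneids)) {
  # Logical indexing keeps every sequence whose id equals geneids[i]
  temp <- c(temp, ff[geneids[i] == ids])
}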
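To see the difference on something reproducible, here is a minimal sketch. It assumes `ff` behaves like a named list and that each name starts with the numeric gene ID followed by an underscore; the toy sequences and the `_a`, `_b`, `_c` suffixes are made up for illustration and are not from the real data. It also shows that the loop can probably be replaced by a single `%in%` subset.

```r
# Hypothetical stand-in data: names follow an assumed "<geneid>_<suffix>" pattern
ff <- list("1024_a" = "ATGC", "1024_b" = "ATGG", "1024_c" = "ATGA",
           "284_a"  = "CCGT", "526_a"  = "GGTA")
desc    <- names(ff)
geneids <- c(1024, 284, 526)

# Numeric gene id of each sequence, using the same pattern as in the loop
ids <- as.numeric(gsub("\\_.*", "", desc))

first_hit <- match(1024, ids)    # position of the first hit only
all_hits  <- which(ids == 1024)  # positions of every hit

# Vectorized alternative to the loop: keep every sequence whose id is in geneids
subset.seq <- ff[ids %in% geneids]
names(subset.seq)
```

Note that `ff[ids %in% geneids]` returns the sequences in the order they appear in `ff`, while the loop returns them grouped in the order of `geneids`. If that order matters, the loop version may be the better fit.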

user2498657