How to loop through vectors when the lengths of the vectors are changing?

Question

I have two vectors that are initially the same length. The first holds protein modification sites, e.g. "E123". The second holds a unique code for the literature reference to each site. I need to go through these vectors and remove multiple references to the same site from the same paper. That is, if VectorOne[1] == VectorOne[2] && VectorTwo[1] == VectorTwo[2], I need to remove the duplicate. The problem is that when I use for loops to go through the data, removing elements changes the lengths of the vectors, so the indices I'm using may no longer be correct.

As soon as I have removed a single element from the vectors, the upper bound I am looping to, length(primarySite), is too high and the code crashes.

Here is an example of the first 10 values from these two vectors:

primarySite[1:10]
 [1] ""     ""     "D248" "E241" "E242" "E241" "E242" "D244" "D244" "E241"
sitePMID[1:10]
 [1] 24641686 24055347 23955771 23955771 23955771 23955771 23955771 23955771 23955771 23955771

Desired Output:
primarySite[1:6]
 [1] ""     ""     "D248" "E241" "E242" "D244" 
sitePMID[1:6]
 [1] 24641686 24055347 23955771 23955771 23955771 23955771 


for (i in 1:length(primarySite)) {
  for (j in (i + 1):length(primarySite)) {
    if (primarySite[i] == primarySite[j] && sitePMID[i] == sitePMID[j]) {
      primarySite <- primarySite[-j]
      sitePMID <- sitePMID[-j]
    }
  }
}

There are many better ways to do this. You most likely don't even need a loop (let alone 2!). It'd be easier to help if you provide sample data for primarySite and sitePMID. — Shree, May 30 '19 at 18:32
Do you want to remove all duplicates or only consecutive duplicates? — Gregor Thomas, May 30 '19 at 19:00

score 0 · Answer 1 · answered May 30 '19 at 20:27

This is easy if we put the vectors in a data frame:

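# Pair each site with its reference: one row per (primarySite, sitePMID)
data = data.frame(primarySite, sitePMID)
# unique() drops every row that repeats an earlier row, keeping the first
deduplicated_data = unique(data)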
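
As a quick check, here is a sketch that runs the same two steps on the ten sample values from the question. The `_dedup` names are made up for illustration, and `stringsAsFactors = FALSE` is an addition so the sites stay character strings on R versions before 4.0.

```r
# Sample values copied from the question (first 10 elements of each vector)
primarySite <- c("", "", "D248", "E241", "E242",
                 "E241", "E242", "D244", "D244", "E241")
sitePMID <- c(24641686, 24055347, 23955771, 23955771, 23955771,
              23955771, 23955771, 23955771, 23955771, 23955771)

# One row per (site, PMID) pair
data <- data.frame(primarySite, sitePMID, stringsAsFactors = FALSE)

# unique() keeps the first occurrence of each distinct row
deduplicated_data <- unique(data)

# Pull the columns back out if separate vectors are still needed
# (the "_dedup" names are hypothetical, chosen for this example)
primarySite_dedup <- deduplicated_data$primarySite
sitePMID_dedup <- deduplicated_data$sitePMID

print(deduplicated_data)
```

On this sample the six remaining rows should match the desired output in the question. Note that this removes all repeated pairs, not only consecutive ones.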

You can find many other ways in the R-FAQ.
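
Two of those other ways, sketched on the sample values from the question. The first builds a logical mask with `duplicated()` and subsets both vectors once, so no index ever goes stale. The second is an explicit loop, in case a loop is really wanted: `for (i in 1:length(x))` builds its sequence once, before any element is removed, so the sketch uses `while` and re-checks `length()` on every pass. The `_mask` and `_loop` names are made up for illustration, and both versions assume the vectors contain no `NA` values.

```r
# Sample values copied from the question
primarySite <- c("", "", "D248", "E241", "E242",
                 "E241", "E242", "D244", "D244", "E241")
sitePMID <- c(24641686, 24055347, 23955771, 23955771, 23955771,
              23955771, 23955771, 23955771, 23955771, 23955771)

# Way 1: logical mask. duplicated() is TRUE for every pair that
# repeats an earlier pair, so negating it keeps first occurrences.
keep <- !duplicated(data.frame(primarySite, sitePMID))
site_mask <- primarySite[keep]
pmid_mask <- sitePMID[keep]

# Way 2: explicit loop on copies of the vectors.
site_loop <- primarySite
pmid_loop <- sitePMID
i <- 1
while (i < length(site_loop)) {        # length is re-read on every pass
  j <- i + 1
  while (j <= length(site_loop)) {
    if (site_loop[i] == site_loop[j] && pmid_loop[i] == pmid_loop[j]) {
      # Remove element j from both vectors. Do not advance j:
      # the next element has just slid into position j.
      site_loop <- site_loop[-j]
      pmid_loop <- pmid_loop[-j]
    } else {
      j <- j + 1
    }
  }
  i <- i + 1
}

# Both approaches should agree
print(identical(site_mask, site_loop))
```

The mask version is usually preferable: it is vectorised, and it cannot run past the end of a shrinking vector.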

Gregor Thomas
