
I have two largish data frames. I want to search through the larger one and replace a value whenever two variables (an integer and a character) match a row in the smaller one, as shown below.

But it takes a massive amount of time. Is there a way in R to speed this up and to get a good display of actual progress?

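for (i in seq_len(nrow(AltOne))) {          # 146385 rows
  progress(i)

  for (j in seq_len(nrow(Alt1codes))) {     # 2078 rows
    # Copy column 6 of the lookup row when both key columns match
    if (AltOne[i, 4] == Alt1codes[j, 1] && AltOne[i, 7] == Alt1codes[j, 2]) {
      AltOne[i, 8] <- Alt1codes[j, 6]
    }
  }
}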
  • Read about merge, closing as duplicate, see linked post to see if it works for you. If it doesn't provide example data, and expected output. – zx8754 Nov 11 '20 at 08:33
  • As zx8754 said, part of your solution can be implemented as a merge or a join of your data frames. You may need to filter the output rows to keep what is of interest to you later. That solution will possibly have a larger memory footprint, but it will be much faster. If you need further help, please, provide a good reproducible example to let us help you. See: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610 – zeehio Nov 11 '20 at 09:09
  • Thanks. I'm interested in your comment "...you may need to filter the output rows". Can you elaborate? I get more rows in the output than in the input! E.g. df1 has 146385 records and df2 has 2078, yet merge (I have tried multiple types: inner, outer, etc.) returns 152796 records. –  Nov 12 '20 at 01:48
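
A likely reason for the extra rows (an assumption, since no example data was posted) is that some (integer, character) pairs occur more than once in Alt1codes: merge returns one output row per matching pair of rows, so every duplicated key adds rows. The sketch below uses small made-up data frames with the same column positions as in the question, counts the duplicated keys, keeps the last row per key (the row the nested loop ends up using), and then does the replacement with base R's match instead of a loop. The toy values, the column names and the "|" separator are illustrative only.

```r
# Toy stand-ins for the real data (hypothetical values). Only the column
# positions from the question matter: AltOne[, c(4, 7, 8)] and
# Alt1codes[, c(1, 2, 6)].
AltOne <- data.frame(a = 1:4, b = 0, c = 0,
                     id = c(10L, 20L, 10L, 30L),
                     e = 0, f = 0,
                     code = c("x", "y", "z", "x"),
                     value = NA_character_,
                     stringsAsFactors = FALSE)
Alt1codes <- data.frame(id = c(10L, 20L, 10L),
                        code = c("x", "y", "x"),   # (10, "x") appears twice
                        c = 0, d = 0, e = 0,
                        new_value = c("first", "second", "third"),
                        stringsAsFactors = FALSE)

# One composite key per row, built from the integer and character columns.
# Assumes "|" never occurs inside the codes themselves.
key_main   <- paste(AltOne[[4]], AltOne[[7]], sep = "|")
key_lookup <- paste(Alt1codes[[1]], Alt1codes[[2]], sep = "|")

# Number of repeated keys in the lookup table; each one makes merge()
# return additional rows.
n_dup <- sum(duplicated(key_lookup))

# Keep the last row per key, which is the one the nested loop would apply.
keep       <- !duplicated(key_lookup, fromLast = TRUE)
lookup     <- Alt1codes[keep, ]
key_lookup <- key_lookup[keep]

# For each AltOne row, the index of its matching lookup row (NA if none).
idx <- match(key_main, key_lookup)
hit <- !is.na(idx)

# Overwrite column 8 only where a match exists; other rows stay unchanged.
AltOne[hit, 8] <- lookup[idx[hit], 6]
```

Because this is a single vectorised lookup rather than roughly 146385 × 2078 comparisons, it should finish quickly enough that a progress display is no longer needed. If n_dup is greater than zero on the real data, that would also account for merge returning more rows than AltOne has.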
