Why am I getting only marginal improvements in time when I cut the computations in half?

Question

I wanted to write a program which will print and count all the primes between 1 and k. The idea is that we take each odd number i between 1 and k and check for odd factors of i up to sqrt(i). It took my computer 1 minute 47 seconds to go check for primes up to k = 1 000 000.

Then I tried to improve my code: instead of checking all odd numbers from 1 to k, I removed all the multiples of 3, 5, and 7 from these odd numbers, so basically I removed about half the numbers we have to check. I ran the program again, and with k = 1 000 000 I got a time of 1 minute and 40 seconds, only a 7 second improvement.

Here is my code:

   #The program has some weird quirks 
#(it doesn't print 3 and 5 because of some errors I was getting) but other than that 
#it seems to work fine

#k is the number that we check up to
k <- 1000000

#n is the number of primes, initialized at 2
n <- 2

#We take only the odd numbers between 5 and k
y <- seq(5, k, 2)

#We take each member of i and check it for primality
for (i in y) {
  #i is assumed to be prime until proved otherwise
  prime <- TRUE

  #We check the remainder when i is divided by every odd number less than its square root
  for (j in c(2, seq(3, ceiling(sqrt(i)), 2))) {
    if (i %% j == 0) {

      #If it's found that some number divides i, we set prime to false and move on 
      #to the next i
      prime <- FALSE
      break
    }
  }
  #If no number has divided i, we haven't broken the loop so R will get to this point
  #We shouldn't need the if statement, but for some reason the n counter 
  #gets too big if I remove the statement
  if (prime) {
    print(i)
    n <- n + 1
  }
}

#Print out the number of primes
print(paste("There are", n, "prime numbers between", 1, "and", k))

In the version where I also remove multiples of 3, 5, and 7, I just define y as

y <- setdiff((setdiff((setdiff(seq(5, k, 2), seq(6, k, 3))), seq(10, k, 5))), seq(14, k, 7))

Sorry it's very messy, but it works.

Have you checked how long the ´y <- ´ part is taking? I don't know R but I assume the setdiff will consume some of the time you are saving in later steps. — SaiBot, Mar 26 '18 at 08:36
@SaiBot That part should only be ran once in the program, and I tried that part in isolation and the result was instant. Do you think this is enough to exclude that as a suspect? — Ovi, Mar 26 '18 at 08:39
The reason that removing multiples of 3, 5, and 7 didn't help too much is that the composites you removed are all found very quickly anyway, so you did not actually cut the computation in half. Most of the time in your function is taken up by finding primes and composites with no small factors. Those require many iterations of the inner loop. — Matt Timmermans, Mar 26 '18 at 12:28
@MattTimmermans Would you have any suggestions of how to modify the algorithm to make it faster? The only thing that comes to my mind is to make a running vector of primes that I've already found and divide only those. — Ovi, Mar 26 '18 at 15:40
@Ovi that's a very large topic. The vector of primes will help a lot, but the sieve method that Dan Hall posted is better — Matt Timmermans, Mar 26 '18 at 15:54
As @MattTimmermans has pointed out, this is a very large topic with a lot of active research. I would suggest looking at the source of any of the numerous packages that generates primes (e.g. `numbers`, `primefactr`, `sfsmisc`, `spuRs`, etc.). Another great resource is [https://primesieve.org/](https://primesieve.org/), which `RcppAlgos::primeSieve` is based off of. It's in C++ but is very easy to read and follow. Lastly, I wrote a pretty **[thorough overview](https://stackoverflow.com/a/48313378/4408538)** of this topic that should be helpful. — Joseph Wood, Apr 01 '18 at 12:01

De Novo · Accepted Answer · 2018-03-26T09:31:51.667

3

Profile your code to find out where the bulk of the time is. As a general rule, don't use generics that have to determine their methods. Swap seq.int for seq. Especially in R, you can't just count the number of computations you think something will take and cut down on that. Actually look at how things are called.

Here's how to profile your code:

Call Rprof(tmp <- tempfile()), then execute your algorithm, then call Rprof();summaryRprof(tmp). Look at the calls with a high total.pct, and make your adjustments there.

If you want a fast function for quickly determining all the primes up to some number in R, here's one for you. I don't think I wrote it, but my file doesn't tell me where I got it. It's an implementation of the sieve of Eratosthenes.

sieve <- function(n){
  n <- as.integer(n)
  primes <- rep(TRUE, n)
  primes[1] <- FALSE
  last.prime <- 2L
  fsqr <- floor(sqrt(n))
  while (last.prime <= fsqr){
    primes[seq.int(2L*last.prime, n, last.prime)] <- FALSE
    sel <- which(primes[(last.prime+1):(fsqr+1)])
    if(any(sel)){
      last.prime <- last.prime + min(sel)
    }else last.prime <- fsqr+1
  }
  which(primes)
}

EDIT: I most certainly didn't write this, though my memory made me think I might have implemented it in R from some other language, this is not the case. A quick google search of the code shows that I got it directly from a post here several years back when I was just a lurker. The first sieve answer there was posted by @GeorgeDontas: https://stackoverflow.com/a/3790309/7936744

EDIT 2: Now looking back through my files, I see my very minimal tweaks can eke a little bit more time out of this, and, probably unreliably, avoids a rare factor of 10 increase:

prime_logic <- function(n){
  n <- as.integer(n)
  primes <- rep_len(TRUE, n)
  primes[1] <- FALSE
  last.prime <- 2L
  fsqr <- floor(sqrt(n))
  while (last.prime <= fsqr){
    primes[seq.int(2L*last.prime, n, last.prime)] <- FALSE
    sel <- which(primes[seq.int(last.prime+1, fsqr+1)])
    if(any(sel)){
      last.prime <- last.prime + min(sel)
    }else last.prime <- fsqr+1
  }
  primes
}

Unit: milliseconds
               expr      min       lq     mean   median       uq       max neval cld
       sieve(1e+06) 29.80531 36.41955 50.24195 38.53144 41.86716 889.47422   100   a
 prime_logic(1e+06) 22.19734 33.05792 39.35390 35.52268 41.02838  71.68033   100   a

edited Mar 26 '18 at 09:31

answered Mar 26 '18 at 08:41

De Novo

7,120
1
23
39

Thanks for the update! I am a complete beginner so I am still working on understanding everything in your answer :) – Ovi Mar 26 '18 at 08:49
Glad it was helpful! Learning to use the profiler is huge if you're interested in efficiency. [This](https://stat.ethz.ch/pipermail/r-help/2007-October/143891.html) might be helpful as you work through understanding the output. – De Novo Mar 26 '18 at 08:52
Ah that's great! – Ovi Mar 26 '18 at 08:53
I'm not really sure how helpful the last function is as you are only return a vector of booleans. I guess you could call `sum` to determine the number of primes less `n`. Also, I don't understand what you mean by "rare factor of 10" as the only difference between `prime_logic` and `sieve` is a call to `which`. Last thing, you should give credit to @John. [His answer](https://stackoverflow.com/a/3791284/4408538) was just below George Donta's answer and you will see that it is exactly the same as your function `sieve`. – Joseph Wood Apr 01 '18 at 11:51
@JosephWood look at `prime_logic` again. The (as I said, minimal) performance difference doesn't come from removing the call to `which`. And you can see the microbenchmark data is right there, so you don't have to be confused or guess about the performance. Run your own. It's a little faster. I removed the call to which because I didn't need it. There are a number of reasons you'd want the prime index. It is, in fact, just as useful in computation as a vector of the numbers themselves. – De Novo Apr 01 '18 at 18:08
@DanHall, I put both functions in text comparer and found that, `prime_logic` is only different in name, `rep_len` instead of `rep`, `seq.int` instead of `:`, and the use of `which`. I'm sorry for claiming only the latter. Anywho, generally speaking, when benchmarking, you should test functions that have the same output. When I ran _my own benchmarks_, as you suggest, and added `which` to `prime_logic`, in order to return the same object, I obtained virtually the same results. Again, as I said earlier, you should give credit to @John. Lastly, I still don't understand _"rare factor of 10"_. – Joseph Wood Apr 02 '18 at 00:11
Also, my benchmarks show that the slight gain in performance is solely due to the removal of `which`. – Joseph Wood Apr 02 '18 at 00:21
@JosephWood I'm not sure what the problem is. The data and the code are in the post. The rare factor of 10 is right there in the summary. I mentioned it to indicate that factor of 10 is unreliable and shouldn't be used as evidence for which is faster. You can see that, as I said, it's a series of small tweaks, switching from S3 generics to primitives and returning a logical index instead of an integer index. Both outputs can be used in the same computations as is, so it is reasonable to compare them. If you actually want to find a specific prime number, neither is an efficient choice. – De Novo Apr 02 '18 at 03:16

niko · Answer 2 · 2018-03-26T17:54:52.510

2

Here is an alternative function to @DanHall's answer, which uses your logic (as in your testing method for primes):

primetest <- function(i, k){
  if(i >= 5 && i < k){
    vtest <- i %% c(2, seq(3, ceiling(sqrt(i)), 2))
    condition <- !is.element(0, vtest)
    if(condition){return(i)}
  }
}
V1 <- unlist(sapply(y, primetest, k = k))
length(V1)
# number of primes between 5 and k
[1] 78495
# here are the first 100 primes yielded, i.e. >= 5
V1[1:100]
 [1]   5   7  11  13  17  19  23  29  31  37  41  43  47  53  59  61  67  71  73  79  83  89
 [23]  97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199
 [45] 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337
 [67] 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431 433 439 443 449 457 461 463
 [89] 467 479 487 491 499 503 509 521 523 541 547 557

It is most definitely faster than using a loop, but I must admit I do not know if other answers are quicker or not.

edited Mar 26 '18 at 17:54

answered Mar 26 '18 at 08:45

niko

5,253
1
12
32

You use a loop, by the way, and `sapply`, while being a nice, convenient, and protective wrapper, is not the most efficient loop. If you want to use a loop wrapper, `vapply` is the most efficient. Bare `for` loops tend to win out for efficiency though. – De Novo Mar 26 '18 at 08:50
@DanHall I am really baffled by the computation time difference between `sapply` and `vapply`, thanks for that (changed it right away). – niko Mar 26 '18 at 08:55
Using the vectorized structure of an object and not looping at all is the way to beat a bare `for` loop, but don't think you're using `lapply` family functions for time efficiency. – De Novo Mar 26 '18 at 08:55
@DanHall I am not that familiar yet with such issues, but I am somewhat puzzled: the original method took well above 60s, the `sapply`/`vapply` methods 12s/2s resp. So isn't the `lapply` family helping regarding time efficiency (even if it isn't the most efficient method)? – niko Mar 26 '18 at 09:04
In my experience, when you actually measure with, e.g., `microbenchmark`, a well constructed for loop is often more time efficient than the apply functions. But you always have to test it. – De Novo Mar 26 '18 at 09:12
@nate.edwinton I'm not familiar with these apply functions yet. Could you please tell me what I have to plug into the function to get the primes up to 1 000 000? I would like to compare the times on my computer. Also, your program doesn't actually print out the primes, does it? – Ovi Mar 26 '18 at 15:39
@Ovi No it doesn't, I thought the purpose was solely to count the number of primes. I edited my code, now it returns the primes as well. – niko Mar 26 '18 at 17:52

score 1 · Answer 3 · answered Mar 26 '18 at 08:45

I know this is somehow off topic, but I thought it is a useful note here.

When numbers go higher, you cannot use this method for finding primes at all. You need to use probabilistic methods. One of the most famous for checking if a number is prime or not, is Miller–Rabin primality test which is used for example in RSA's prime numbers.

Why am I getting only marginal improvements in time when I cut the computations in half?

3 Answers3