advice on how to make my algorithm faster

Question

here is my code in C for problem#3 from project-Euler, where I have to find the largest prime factor of 600851475143.

#include <stdio.h>
#include <stdlib.h>

bool is_prime(long int number){
     long int j;
     for (j=2; j<=number/2; j++){
             if (number%j==0) return false;
             if (j==number/2) return true;
            }
    }

int main(){

     long int input;
     scanf("%d", &input);
     long int factor;
     int ans=0;

     for (factor=input/2; factor>1; factor--){
             if (input%factor==0 && is_prime(factor)) {
                     ans = factor;
                     break;
                    }
            }

     printf("%d\n", ans);
     system("pause");
     return 0;
    }

Although it works fine for small numbers, gradually it takes more and more time for it to give an answer. And, finally, for 600851475143 the code returns 0, which is obviously wrong. Could anyone help? thanks a lot.

This can depend on your platform and how it defines an `int`, but look up `INT_MAX` to see why you might be running into problems with `600851475143` as your test value. Particularly, look more closely at your format specifier in `scanf()`. Compile with all and extra warnings (`-Wall -Wextra`) to see if you get warnings about overflow. — Alex Reynolds, Mar 03 '14 at 17:41
There are many, many articles on the internet about how to rapidly generate prime numbers; I would start by reading them. — Eric Lippert, Mar 03 '14 at 17:57
The most significant improvement you can do is only test up to the sqrt of the number. Start at sqrt(n) instead of input/2. — Z boson, Mar 03 '14 at 19:21
@EricLippert the fastest solution for this problem does not involve generating primes. (cf. e.g. http://stackoverflow.com/questions/15279278/finding-largest-prime-number-out-of-600851475143/15292911#15292911). — Will Ness, Mar 03 '14 at 22:16
@WillNess: Good point, though for later Project Euler questions it is extremely handy to have fast methods for generating primes, testing for primeness, and so on. Moreover, the whole point of PE is for people to learn about these topics in mathematics, so I think it's a good idea to encourage people to read broadly on these topics. — Eric Lippert, Mar 03 '14 at 23:14

templatetypedef · Accepted Answer · 2014-03-03T19:11:27.057

3

A few things to consider:

As @Alex Reynolds pointed out, the number you're trying to factor might be so large that it can't fit in an int. You may need to use a long or a uint64_t to store the number. That alone might solve the problem.
Rather than checking each divisor and seeing which ones are prime, you might instead want to try this approach: set n to 600851475143. For each integer from 2 upward, try dividing n by that integer. If it cleanly divides out, then divide out all copies of that number from n and record the largest prime factor as being the current integer. If you think about it a bit, you'll notice that the only divisors you'll consider this way are prime numbers. As a helpful hint - if n has no divisors (other than 1) less than √n, then it's prime. That might help give you an upper bound on your search space that's much tighter than the division by two trick you're using.
Rather than increasing the divisor by one, try testing out 2 as a divisor and then only dividing by odd numbers (3, 5, 7, 9, 11, etc.) No even number other than 2 is prime, so this halves the number of numbers you need to divide by.

Alternatively, create a file storing all prime numbers up to √600851475143 by downloading a list of primes from the internet, then just test each one to see if any of them divide 600851475143 and take the biggest. :-)

Hope this helps!

edited Mar 03 '14 at 19:11

answered Mar 03 '14 at 17:51

templatetypedef

362,284
104
897
1,065

1

+1 for good advice generally, but this is incorrect: *no number has a prime factor bigger than its square root.* For example, 22 has prime factors 2 and 11, and 11 is clearly larger than √22. It *is* still true, though, that you only have to check for divisors up to √n because any prime factor larger than √n will need to by multiplied by another prime factor that's smaller than √n. – Caleb Mar 03 '14 at 18:00
3. can be improved by excluding multiples of 3. Every prime except 2 and 3 has the form p=6*k-1 or p=6*k+1. – Lutz Lehmann Mar 04 '14 at 11:00
@LutzL, it can be improved even further by excluding multiples of 5 as well. This is called wheel factorization. In my answer I exclude multiples of 2, 3, and 5. – Z boson Mar 04 '14 at 13:08
But is the additional exclusion of 5 and 25 mod 30 worth the extra effort against using the more compact `for(int p=5,d=2; p*p<=N; p+=d, d=6-d){ `? I understand that the next wheel with 7 brings substantial reductions.--~~~~ – Lutz Lehmann Mar 04 '14 at 13:25
@LutzL, isn't p*p dangerous due to overflow (better to use sqrt(N)). – Z boson Mar 04 '14 at 15:02
Only if N is very close to the maximum of the int type used, so that p reaches sqrt(N)+4 having a square of N+8*sqrt(N)+16, i.e., N>INT_MAX-16-8*sqrt(INT_MAX). int has to be exchanged for long int or UINT64 for large numbers. For UINT64, the maximal save N is then (1<<64)-16-(1<<35). – Lutz Lehmann Mar 04 '14 at 15:31
@LutzL, I like your explanation for determining the max safe value. But could you tell me where the 4 comes from (sqrt(N)+4)? Is this assuming that you are incrementing p by up to four. So if p*p=N on one iteration on the next iteration p=sqrt(N)+4. – Z boson Mar 06 '14 at 10:46
Yes, the 2-3-wheel has increments of 2 and 4 alternatingly. So in the worst case `N=MAX_INT`, and the last admissible `p=[\sqrt(MAX_INT-1)]` gets increased to `p+4` with `(p+4)^2>MAX_INT`. A finer analysis might end up with tighter bounds. – Lutz Lehmann Mar 06 '14 at 11:04
@LutzL, I think you solved the equation wrong. It should be N + 8*sqrt(N)+16 = INT_MAX. If you solve this for N you end up with the quadratic formula . I'm using a 2-3-5 wheel so the max is 6 but I get the idea. – Z boson Mar 06 '14 at 12:43
So `N<=INT_MAX-8*sqrt(N)-16 – Lutz Lehmann Mar 06 '14 at 13:18
@LutzL, actually you had your math right. I had it wrong. It's N>INT_MAX-16-8*sqrt(INT_MAX) like you said. And it does appear computing the intger square root is better http://stackoverflow.com/questions/22556599/conditional-tests-in-primality-by-trial-division. – Z boson Mar 21 '14 at 13:43
On the last point, the jury may still be out. In the case of the link, testing of primality, you only need the square root once. Here you would have to recompute the square root every time a prime factor is found. It may depend on the wheel, without wheel it is plausible that the square root increases the speed. With a large wheel, this is not as clear. – Lutz Lehmann Mar 21 '14 at 13:59

score 2 · Answer 2 · answered Mar 03 '14 at 19:01

I suggest you to improve the primality check part of your code. The running time of your method is O(n²) so you should use a more efficient algorithm for this like the well-known Miller–Rabin primality test with O(klog³n). I provide a pseudo code here for you and you can write the code on your own:

Input: n > 3, an odd integer to be tested for primality;
Input: k, a parameter that determines the accuracy of the test
Output: composite if n is composite, otherwise probably prime
write n − 1 as 2s·d with d odd by factoring powers of 2 from n − 1
WitnessLoop: repeat k times:
   pick a random integer a in the range [2, n − 2]
   x ← ad mod n
   if x = 1 or x = n − 1 then do next WitnessLoop
   repeat s − 1 times:
      x ← x2 mod n
      if x = 1 then return composite
      if x = n − 1 then do next WitnessLoop
   return composite
return probably prime

I provide a link for you to see an implementation in python that also compares this algorithm with yours. BTW, there are many implementations of this algorithm all over the web but I think righting it by yourself may help you to better understand it.

Z boson · Answer 3 · 2014-03-05T10:29:50.767

2

Try the following code. It essentially implements the points in the accepted answer. The only improvement is that it skips all multiples of 2, 3, and 5 using wheel factorization http://en.wikipedia.org/wiki/Wheel_factorization

//find largest prime factor for x <2^64
#include <stdio.h>
#include <stdint.h>
int main() {
    uint64_t x = 600851475143;
    int wheel[] = {4,2,4,2,4,6,2,6};
    while(x>2 && x%2==0) x/=2;
    while(x>3 && x%3==0) x/=3;
    while(x>5 && x%5==0) x/=5;   
    for(uint64_t j=0, i=7; i<=x/i; i+=wheel[j++], j%=8) {
        while(x>i && x%i==0) x/=i;
    }
    printf("%llu\n", x);
}

Another thing that could be done is to pre-compute all primes less than 2^32 (rather than downloading them) and then only divide by the primes. The fastest method I know to do this is the Sieve of Eratosthenes. Here is a version using OpenMP which finds the primes up to 1 billion in less than one second http://create.stephan-brumme.com/eratosthenes/

edited Mar 05 '14 at 10:29

answered Mar 04 '14 at 12:48

Z boson

32,619
11
123
226

This code finds the largest factor smaller than x, but it was asked for the largest prime factor. For x=12 the result would be "2, 6", but 6 is not a prime factor. So it is really fastest to divide out the small prime factors, also accounting for their multiplicity. – Lutz Lehmann Mar 04 '14 at 13:30
@LutzL, I fixed it. It now returns the largest prime as 6857 which is the correct answer. – Z boson Mar 04 '14 at 13:59
That should do. Nice and compact. Only that factors 2,3,5 are not removed if present. Since the test number does not have them, it is inconsequential, but try with `x=600`. `while(x>2 && x%2==0) x/=2,while(x>3 && x%3==0) x/=3;while(x>5 && x%5==0) x/=5;` – Lutz Lehmann Mar 04 '14 at 14:08
@LutzL, thanks again! I added your changes. Now it works for all x. – Z boson Mar 04 '14 at 14:23
@LutzL, one improvement would probably be to not do the division twice (x%i, then x/i) – Z boson Mar 04 '14 at 14:24
One would have to check the assembly code. If there is a divmod instruction in the instruction set, chances are the compiler is clever enough to reuse the quotient computed during evaluation of x%i also for x/i. – Lutz Lehmann Mar 04 '14 at 14:27
and what would the answer for `x=125` be? also, your last paragraph is incorrect. with sieve it would be slower, and you couldn't avoid divisions entirely even if you had primes already (you'd just divide by primes instead of by 30-coprimes). You'd just perform slightly less of them. Sieving up to sqrt of 600851475143 is roughly 2 mln operations; its factorization with iterated division like here takes less than 1500 divisions. – Will Ness Mar 05 '14 at 06:37
here's why: `600851475143 == product [71,839,1471,6857] && all is_prime [71,839,1471,6857] && 1471 < sqrt(6857)`. – Will Ness Mar 05 '14 at 06:47
@WillNess, for x=125 the answer is 5 you can see it for yourself here http://coliru.stacked-crooked.com/a/19c70ae5d1181409 – Z boson Mar 05 '14 at 08:17
@WillNess. I thought about the sieve. You could use the sieve to form a table of all the primes less than 2^32. Then you can read this table and use it to test primes up to 2^64. This is basically what is said in the last paragraph of the accepted answer. It's just rather than downloading them you generate them yourself. – Z boson Mar 05 '14 at 08:18
@Zboson about 125 - my bad, should've payed attention to the `x>5` bit. :) -- about sieve: you don't need any primes here whatsoever, when you do a factorization of only one (this) number - it doesn't pay off. the highest factor you reach here is 1471. whether you run on odds, 6-coprimes, or primes, doesn't matter here at all. --- BTW `i<=sqrt(x)` is better re-written as `i <= x/i`, generally (here it doesn't matter, again). – Will Ness Mar 05 '14 at 10:03
@Will Ness, if matters if x is a very large prime itself. It's about a general algorithm. If the question is only about how to find a fast algorithm to find the largest prime of 600851475143 then the answer is trivial: return 6857. – Z boson Mar 05 '14 at 10:06
@Zboson I thought the OP asked "I have to find the largest prime factor of 600851475143". "return 6857" is no answer to *how can it be found* (efficiently). -- for very large primes trial division is no good anyway. Using pre-calculated primes instead of odds/wheels only improves the complexity by log(N) - not insignificant, yes, but far from extreme improvement. -- about the accepted answer: that is why I didn't up-vote it. :) there are 62113 primes below the sqrt of 600851475143: about 100x worse than what your code here is doing. (hence the upvote:)). – Will Ness Mar 05 '14 at 10:19
@WillNess, I modified the code with your suggestions. – Z boson Mar 05 '14 at 10:30
@WillNess, my current code takes 391 iterations (up to prime 1471) to answer the OP's question. There are 233 primes up to 1471. If I used a list of primes it would be about 40% faster. If x=2^63-25 (which is a prime) it probably makes a difference. I wonder how long that takes with my code... – Z boson Mar 05 '14 at 15:09
1

It takes 13.5 seconds to determine that 2^63-25 is a prime (sandy bridge @3.6 GHz). – Z boson Mar 05 '14 at 15:11
@Zboson 13.5 seconds with *your* code, yes? log(2^31.5) =~ 22, so it'd be ~2.3 seconds with primes only (*if* you already had them). I meant "significant" in how far can we practically reach: 2^63 or 2^65 doesn't matter, compared with 2^256. -- So the question becomes, can we sieve up to 2^31.5 in 11 seconds or less? :) Looks like http://primesieve.org/ can do this in ~1 second or less. – Will Ness Mar 05 '14 at 20:07
Yes 13.5 with my code. But I just realized the `i<=x/i` is slow. It has to do division every iteration. It's about twice as fast to use `i*i<=x` or sqrt(x). I'm look into now. I'm also using `x = x^64-59` which is the largest prime that fits in uint64_t (actually I do `x = -1L-58`. That link I posted with OpenMP fints the 2^31 primes in one second as well. – Z boson Mar 05 '14 at 20:17
@WillNess. You wrote "i<=sqrt(x) is better re-written as i <= x/i". Can you tell me why? See this http://stackoverflow.com/questions/22556599/conditional-tests-in-primality-by-trial-division – Z boson Mar 21 '14 at 13:42
@Zboson I was wary of the overflow; in Java it's better as `i*i <= x`. Will look into the link later. :) – Will Ness Mar 21 '14 at 14:39

advice on how to make my algorithm faster

3 Answers3

Linked