1

I'm trying to write a python code to find the prime factors of any given number

def pf(n):
    for i in range(2,n):
        if n%i==0: #find the factors
            for j in range(2,i): #check if the factor is prime
                if i%j==0:
                    break
            else: #find the prime ones
                print(i)

My problem is that this code works fine with small numbers however with big numbers i have to interrupt the execution
for example:

pf(600851475143)
71
839
1471
6857
Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    pf(600851475143)
  File "<pyshell#1>", line 2, in pf
    for i in range(2,n):
KeyboardInterrupt  

the prime factors of this big number were found in less than a second, so my question is how to tweak this code to stop unnecessary iterations after finding the factors with the use of the for not the while loop

tvandenbrande
  • 348
  • 5
  • 21
Dear_user
  • 19
  • 4
  • 2
    Well how do you know you *have* found all the prime factors? For a start, note that you can stop at e.g. `sqrt(i)`, and that `2` is the only even prime. – jonrsharpe Nov 26 '14 at 11:46
  • because no matter you wait it's the same result, after 1 second or 30 minutes – Dear_user Nov 26 '14 at 12:03
  • You could start by finding the factors of very small numbers and see if it works there – Tobias Kienzler Nov 26 '14 at 12:04
  • @Dear_user sorry, I mean: *"how do you decide **programatically** that you've found them all?"* To tell a computer to do something, you need to be able to do it yourself. You *could* code that if no more have been found in some defined time period, stop there, for example - that would be how to model your current thought process. – jonrsharpe Nov 26 '14 at 12:07
  • @jonrsharpe: i got you the 1st time, i can't know that it found them all, for this one by testing only – Dear_user Nov 26 '14 at 12:37

5 Answers5

1

You can speed things up by dividing n by the obtained value in each iteration step. This way you decrease the number you are iterating. I would implement something like this (not yet sure if this is optimal and results in the lowest number of operations):

from math import sqrt

def pf(n):
    if n == 1: 
        return
    if n % 2 == 0:
        print(2)
        pf(n/2)
        return
    for i in range(3, int(sqrt(n))+1, 2):
        if n % i == 0:
            for j in range(3, int(sqrt(i))+1, 2):
                if i % j == 0:
                    break
            else:
                print(i)
                pf(n/i)
                return
    print(n)

Note, if using the improvement of looping until the root of the number we omit the case that the number itself is a prime number. However, if the function does not result any prime factors it is save to assume that the input is a prime number itself.

The return statements stop the main loop (and the function) after the recursive call. So each call of the function only results in one value and a call for the function on the result of the division of the number by its found prime.

If you make a set with all the prime numbers and check if the value is in this set you will win some time, instead of looping over all values.

Compared to the non-recursive solution by jonrsharpe this one is almost four times as fast:

>>> print timeit.timeit("pf(600851475143)", setup="from __main__ import pf", number=1)
71
839
1471
6857
0.00985789299011
>>> print timeit.timeit("pf2(600851475143)", setup="from __main__ import pf2", number=1)
71
839
1471
6857
0.0450129508972

The implementation is limited by the overflow limit of range(), which results in an overflow for the input value (600851475143**2)+1. More details on the maximum size for range can be found in this question: Python: Range() maximum size; dynamic or static?

A possible issue with this solution could be that the maximum recursion depth is achieved. For more details on that visit this question: Maximum recursion depth

Community
  • 1
  • 1
tvandenbrande
  • 348
  • 5
  • 21
  • i used this trick too when i used to list the answers in a list and it took a long time too – Dear_user Nov 26 '14 at 12:05
  • based on the question I thought he only wanted them to be printed and not create a generator. – tvandenbrande Nov 26 '14 at 12:45
  • @Dear_user I added the timing results for a case on my computer. About 28 miliseconds for a value or order `10^12`. – tvandenbrande Nov 26 '14 at 14:26
  • 1
    @tvandenbrande testing for e.g. `pf(360)` (hint: or any even `n`) causes `UnboundLocalError: local variable 'i' referenced before assignment`. Also, you will hit the recursion limit for numbers with large numbers of prime factors. – jonrsharpe Nov 26 '14 at 14:32
  • @jonrsharpe Found the issue, didn't paste my latest working code here. Forgot to add the prime check. Thank you for pointing that out. – tvandenbrande Nov 26 '14 at 14:34
  • 1
    Yes, your revision works fine for me - the recursive approach is very neat. Note that you can make it faster by adopting the `range` arguments used in my answer - `for i in range(3, int(sqrt(n))+1, 2):` and `for j in range(3, int(sqrt(i)+1), 2):`. Also, using `for: else:` (see https://docs.python.org/3/tutorial/controlflow.html#break-and-continue-statements-and-else-clauses-on-loops) is neater than having a `prime` flag. With these revisions I get 0.03s for recursive vs 0.04s for yours unmodified and 0.14s for mine. Note another slight difference - mine prints only unique prime factors. – jonrsharpe Nov 26 '14 at 14:39
  • 1
    I would prefer to get the full decomposition and not the unique primes. This way you can recompose the input and check if your result is correct. Good hint about the `for: else:` notation. It even improves speed (from 11.9 ms to 9.9 ms on my system) – tvandenbrande Nov 26 '14 at 14:49
  • Interestingly, I have further found that the primality check is unnecessary - because of the way your approach works, the first factor found at each step is prime. I haven't yet convinced myself that this is always true, but haven't yet found a counter example. That would save further processing time. I agree that the full decomposition is useful - again, more so if the list were returned. – jonrsharpe Nov 26 '14 at 14:52
  • 1
    Was googling a bit on your comment (is the prime check necessary) and found this interesting wiki with an example code that is even faster: http://rosettacode.org/wiki/Prime_decomposition#Python – tvandenbrande Nov 26 '14 at 14:55
0

You could try adding prime factors to a list as you find them, and see if they multiply to make the number you are trying to factorize, but I think that might add more time than it would save.

As suggested in the comments, you could also stop at the square root of the number - using for i in range(2, sqrt(n) + 1):.

In terms of generally speeding it up you could also try creating a set of primes, and adding to it when you find them in the 5th line. Example:

if i in primes:
   print(i)
else:
    for j in range(2,i):   # check if the factor is prime
        if i%j==0:
            break
rlms
  • 10,650
  • 8
  • 44
  • 61
  • and how long the list of primes should be? and about the sqrt(n) suggestion is relatively helpful (for example what is i want pf(n**2), your first suggestion is the best and i have to write a code to multiply the factors each time and compare it to the number inside the loop – Dear_user Nov 26 '14 at 12:07
0

One further point - use xrange() rather than range(), so you do not internally create the list of all numbers to iterate: (if you are using Python 2 !)

What is the difference between range and xrange functions in Python 2.X?

Community
  • 1
  • 1
Ricardo Magalhães Cruz
  • 3,504
  • 6
  • 33
  • 57
0

just iterate square root of value, this is how you can iterate through less nombers and use generators to skip repeated iteration usinf for else

from math import sqrt
def pf(n):
    n = int(sqrt(n))
    for i in xrange(2, n): # in python2 use `range` for python3
        for j in xrange(2,i):
            if i%j == 0:
                break
        else:
            yield i # this will return when prime nomber will found.

print list(pf(4356750))
Vishnu Upadhyay
  • 5,043
  • 1
  • 13
  • 24
0

Here's how I would do it:

from math import sqrt

def pf(n):
    """Print the prime factors of n."""
    if n % 2 == 0:
        print(2)
    for i in range(3, int(sqrt(n))+1, 2):
        if n % i == 0: # i is a factor of n
            for j in range(3, int(sqrt(i))+1, 2):
                if i % j == 0:
                    break
            else: # i is also prime
                print(i)

By factoring out the checks for 2 you can almost halve the search space, and using the fact that all prime factors must be below the square root of a number you cut it down even further. This takes about a quarter of a second for 600851475143:

>>> import timeit
>>> timeit.timeit("pf(600851475143)", setup="from __main__ import pf", number=1)
71
839
1471
6857
0.27306951168483806

Another option would be to use a prime sieve to generate all primes below n, then filter out those that are also factors of n (effectively the reverse operation).

jonrsharpe
  • 115,751
  • 26
  • 228
  • 437
  • thanks this one is helpful for example number (12-digits number) however i tried it for (600851475143**2)+1 and it stuck – Dear_user Nov 26 '14 at 12:59
  • 1
    @Dear_user no kidding - that's 3.6e23, for which you have to check through `600851475144` candidate factors and `24` actual factors; its [largest prime factor is `171015606064053029`](http://www.wolframalpha.com/input/?i=prime+factors+of+(600851475143**2)%2B1), to determine the primality of which you have to check `413540332` candidate divisors. If factoring numbers that large were easy, [public-key cryptography](http://en.wikipedia.org/wiki/Public-key_cryptography) would be in serious trouble. – jonrsharpe Nov 26 '14 at 13:46
  • A problem I noticed, also with my answer is that if n itself is a prime, the result is not included. – tvandenbrande Nov 26 '14 at 13:53
  • 1
    @tvandenbrande this has the same issue - one fix would be to return a list of prime factors, rather than printing within the function; if that's empty, you know `n` is itself prime. This also makes the function more useful in other contexts. – jonrsharpe Nov 26 '14 at 14:05