Most efficient way to find all factors with GMPY2 (or GMP)?

Question

I know there's already a question similar to this, but I want to speed it up using GMPY2 (or something similar with GMP). Here is my current code, it's decent but can it be better?

Edit: new code, checks divisors 2 and 3

def factors(n):
    result = set()
    result |= {mpz(1), mpz(n)}


    def all_multiples(result, n, factor):
        z = mpz(n)
        while gmpy2.f_mod(mpz(z), factor) == 0:
            z = gmpy2.divexact(z, factor)
            result |= {mpz(factor), z}
        return result

    result = all_multiples(result, n, 2)
    result = all_multiples(result, n, 3)

    for i in range(1, gmpy2.isqrt(n) + 1, 6):
        i1 = mpz(i) + 1
        i2 = mpz(i) + 5
        div1, mod1 = gmpy2.f_divmod(n, i1)
        div2, mod2 = gmpy2.f_divmod(n, i2)
        if mod1 == 0:
            result |= {i1, div1}
        if mod2 == 0:
            result |= {i2, div2}
    return result

If it's possible, I'm also interested in an implementation with divisors only within n^(1/3) and 2^(2/3)*n(1/3)

As an example, mathematica's factor() is much faster than the python code. I want to factor numbers between 20 and 50 decimal digits. I know ggnfs can factor these in less than 5 seconds.

I am interested if any module implementing fast factorization exists in python too.

The answer depends very much on the size of _n_. Can you tell us what size of _n_ (how many decimal digits) you want to factor? — user448810, May 17 '14 at 11:35

score 7 · Accepted Answer · edited Jan 23 '18 at 02:06

I just made some quick changes to your code to eliminate redundant name lookups. The algorithm is still the same but it is about twice as fast on my computer.

import gmpy2
from gmpy2 import mpz

def factors(n):
    result = set()
    n = mpz(n)
    for i in range(1, gmpy2.isqrt(n) + 1):
        div, mod = divmod(n, i)
        if not mod:
            result |= {mpz(i), div}
    return result

print(factors(12345678901234567))

Other suggestions will need more information about the size of the numbers, etc. For example, if you need all the possible factors, it may be faster to construct those from all the prime factors. That approach will let you decrease the limit of the range statement as you proceed and also will let you increment by 2 (after removing all the factors of 2).

Update 1

I've made some additional changes to your code. I don't think your all_multiplies() function is correct. Your range() statement isn't optimal since 2 is check again but my first fix made it worse.

The new code delays computing the co-factor until it knows the remainder is 0. I also tried to use the built-in functions as much as possible. For example, mpz % integer is faster than gmpy2.f_mod(mpz, integer) or gmpy2.f_mod(integer, mpz) where integer is a normal Python integer.

import gmpy2
from gmpy2 import mpz, isqrt

def factors(n):
    n = mpz(n)

    result = set()
    result |= {mpz(1), n}

    def all_multiples(result, n, factor):
        z = n
        f = mpz(factor)
        while z % f == 0:
            result |= {f, z // f}
            f += factor
        return result

    result = all_multiples(result, n, 2)
    result = all_multiples(result, n, 3)

    for i in range(1, isqrt(n) + 1, 6):
        i1 = i + 1
        i2 = i + 5
        if not n % i1:
            result |= {mpz(i1), n // i1}
        if not n % i2:
            result |= {mpz(i2), n // i2}
    return result

print(factors(12345678901234567))

I would change your program to just find all the prime factors less than the square root of n and then construct all the co-factors later. Then you decrease n each time you find a factor, check if n is prime, and only look for more factors if n isn't prime.

Update 2

The pyecm module should be able to factor the size numbers you are trying to factor. The following example completes in about a second.

>>> import pyecm
>>> list(pyecm.factors(12345678901234567890123456789012345678901, False, True, 10, 1))
[mpz(29), mpz(43), mpz(43), mpz(55202177), mpz(2928109491677), mpz(1424415039563189)]

score 0 · Answer 2 · answered Oct 21 '21 at 04:59

There exist different Python factoring modules in the Internet. But if you want to implement factoring yourself (without using external libraries) then I can suggest quite fast and very easy to implement Pollard-Rho Algorithm. I implemented it fully in my code below, you just scroll down directly to my code (at the bottom of answer) if you don't want to read.

With great probability Pollard-Rho algorithm finds smallest non-trivial factor P (not equal to 1 or N) within time of O(Sqrt(P)). To compare, Trial Division algorithm that you implemented in your question takes O(P) time to find factor P. It means for example if a prime factor P = 1 000 003 then trial division will find it after 1 000 003 division operations, while Pollard-Rho on average will find it just after 1 000 operations (Sqrt(1 000 003) = 1 000), which is much much faster.

To make Pollard-Rho algorithm much faster we should be able to detect prime numbers, to exclude them from factoring and don't wait unnecessarily time, for that in my code I used Fermat Primality Test which is very fast and easy to implement within just 7-9 lines of code.

Pollard-Rho algorithm itself is very short, 13-15 lines of code, you can see it at the very bottom of my pollard_rho_factor() function, the remaining lines of code are supplementary helpers-functions.

I implemented all algorithms from scratch without using extra libraries (except random module). That's why you can see my gcd() function there although you can use built-in Python's math.gcd() instead (which finds Greatest Common Divisor).

You can see function Int() in my code, it is used just to convert Python's integers to GMPY2. GMPY2 ints will make algorithm faster, you can just use Python's int(x) instead. I didn't use any specific GMPY2 function, just converted all ints to GMPY2 ints to have around 50% speedup.

As an example I factor first 190 digits of Pi!!! It takes 3-15 seconds to factor them. Pollard-Rho algorithm is randomized hence it takes different time to factor same number on each run. You can restart program again and see that it will print different running time.

Of course factoring time depends greatly on size of prime divisors. Some 50-200 digits numbers can be factoring within fraction of second, some will take months. My example 190 digits of Pi has quite small prime factors, except largest one, that's why it is fast. Other digits of Pi may be not that fast to factor. So digit-size of number doesn't matter very much, only size of prime factors matter.

I intentionally implemented pollard_rho_factor() function as one standalone function, without breaking it into smaller separate functions. Although it breaks Python's style guide, which (as I remember) suggests not to have nested functions and place all possible functions at global scope. Also style guide suggests to do all imports at global scope in first lines of script. I did single function intentionally so that it is easy copy-pastable and fully ready to use in your code. Fermat primality test is_fermat_probable_prime() sub-function is also copy pastable and works without extra dependencies.

In very rare cases Pollard-Rho algorithm may fail to find non-trivial prime factor, especially for very small factors, for example you can replace n inside test() with small number 4 and see that Pollard-Rho fails. For such small failed factors you can easily use your Trial Division algorithm that you implemented in your question.

Try it online!

def pollard_rho_factor(N, *, trials = 16):
    # https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm
    import math, random
    
    def Int(x):
        import gmpy2
        return gmpy2.mpz(x) # int(x)
    
    def is_fermat_probable_prime(n, *, trials = 32):
        # https://en.wikipedia.org/wiki/Fermat_primality_test
        import random
        if n <= 16:
            return n in (2, 3, 5, 7, 11, 13)
        for i in range(trials):
            if pow(random.randint(2, n - 2), n - 1, n) != 1:
                return False
        return True
    
    def gcd(a, b):
        # https://en.wikipedia.org/wiki/Greatest_common_divisor
        # https://en.wikipedia.org/wiki/Euclidean_algorithm
        while b != 0:
            a, b = b, a % b
        return a
        
    def found(f, prime):
        print(f'Found {("composite", "prime")[prime]} factor, {math.log2(f):>7.03f} bits... {("Pollard-Rho failed to fully factor it!", "")[prime]}')
        return f
        
    N = Int(N)
    
    if N <= 1:
        return []
    
    if is_fermat_probable_prime(N):
        return [found(N, True)]
    
    for j in range(trials):
        i, stage, y, x = 0, 2, Int(1), Int(random.randint(1, N - 2))
        while True:
            r = gcd(N, abs(x - y))
            if r != 1:
                break
            if i == stage:
                y = x
                stage <<= 1
            x = (x * x + 1) % N
            i += 1
        if r != N:
            return sorted(pollard_rho_factor(r) + pollard_rho_factor(N // r))
    
    return [found(N, False)] # Pollard-Rho failed

def test():
    import time
    # http://www.math.com/tables/constants/pi.htm
    # pi = 3.
    #     1415926535 8979323846 2643383279 5028841971 6939937510 5820974944 5923078164 0628620899 8628034825 3421170679
    #     8214808651 3282306647 0938446095 5058223172 5359408128 4811174502 8410270193 8521105559 6446229489 5493038196
    # n = first 190 fractional digits of Pi
    n =   1415926535_8979323846_2643383279_5028841971_6939937510_5820974944_5923078164_0628620899_8628034825_3421170679_8214808651_3282306647_0938446095_5058223172_5359408128_4811174502_8410270193_8521105559_6446229489
    tb = time.time()
    print('N:', n)
    print('Factors:', pollard_rho_factor(n))
    print(f'Time: {time.time() - tb:.03f} sec')
    
test()

Output:

N: 1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055596446229489
Found prime factor,   1.585 bits...
Found prime factor,   6.150 bits...
Found prime factor,  20.020 bits...
Found prime factor,  27.193 bits...
Found prime factor,  28.311 bits...
Found prime factor, 545.087 bits...
Factors: [mpz(3), mpz(71), mpz(1063541), mpz(153422959), mpz(332958319), mpz(122356390229851897378935483485536580757336676443481705501726535578690975860555141829117483263572548187951860901335596150415443615382488933330968669408906073630300473)]
Time: 2.963 sec

This is a very old question of mine. Afterwards I did try to implement algorithms like Pollard rho and quadratic sieve in a faster language like C++. But I appreciate the answer. — qwr, Oct 21 '21 at 07:46
@qwr I also implemented quadratic sieve both in Python and C++. Quadratic sieve is really-really fast, it can factor 256 bit within 10 minutes even if it has just two equal in size factors. Of course there is an even faster GNFS algorithm but it is very complex to implement. Quadratic sieve has intermediate-level of speed and complexity. Here I implemented Pollard Rho just as an example of quite simple to implement but yet fast algorithm. Quadratic Sieve is a way to complex for posting on StackOverflow. StackOverflow has 50 000 chars limit for post and source of quadratic sieve is larger. — Arty, Oct 21 '21 at 08:23
@qwr I'm very fond of Quadratic Sieve. Can you tell what is the largest bit-size of number that your C++ quadratic sieve can factor? And what time it spends to do this? I want to compare to my timings, to figure out if I have optimized version or it needs much more optimizations. If you don't remember timings or don't have time for experiments, just skip this comment-request. — Arty, Oct 21 '21 at 11:52
I have code written a long time ago that was never pretty and there are a lot of completely wasteful things there, but you can see it https://github.com/jxu/Factorize — qwr, Oct 21 '21 at 23:08

Most efficient way to find all factors with GMPY2 (or GMP)?

2 Answers2