6

Actually, given N a (possibly very large) even integer, I want to find N = F * R where gcd(F,R) = 1, F>R, and F is as small as possible (since I'll be completely factoring F). The heart of the problem is finding the largest divisor R where R < sqrt(N).

For example, N=36 should give F=9 and R=4. Notice that R is not necessarily prime, or a prime power. Note that I am NOT factoring N. The only restriction on F and R is that they are relatively prime.

This is my quick and naive version, which is working:

def factor_partial(N):
    for R in xrange(int(math.sqrt(N)),1,-1):
        if N%R == 0 and gcd(R,N/R) == 1:
            return N/R, R

Another way I imagine doing it is by finding the divisors in increasing order, and removing any multiples of nondivisors along the way. Something like:

def factor_partial(N):
    i = range(2, int(sqrt(N)) + 1)
    while i:
        if N % i[0] != 0:
            remove_multiples(i, i[0]) #without removing i[0]
        else:
            if gcd(i[0], N/i[0]) == 1:
                R = i[0]
        i.pop(0) #remove i[0]

    return N/R, R

I think this will be slow and memory intensive, but perhaps if i is instead a generator it could be efficient. I haven't used generators much.

Can I improve the first version? Is the second version viable (how would I do it)? Is there a completely different method that is better?

Looking for answers in python, c, or pseudocode.


This is for a project for a class on number theory. I'm implementing a primality test based on Pocklington. While I need a factorization algorithm, we haven't studied any and I'm probably not going to use one such as the quadratic sieve which is outside the scope of my class. I'm looking for specific help with the question posed.

jmilloy
  • 7,875
  • 11
  • 53
  • 86
  • Does `F * R == N`? I'm not sure what all these variables mean – Blender Nov 25 '11 at 18:17
  • Your second method is the [Sieve of Eratosthenes](http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). I suggest you look at [this question](http://stackoverflow.com/questions/2267146/what-is-the-fastest-factorization-algorithm) as well. – Blender Nov 25 '11 at 18:22
  • @Blender Yes, in a way. I wrote code for a classic sieve to generate primes, but it also doesn't use generators and runs out of memory. Maybe it's time to upgrade that one, too. – jmilloy Nov 25 '11 at 18:26
  • The sieve will not work for large numbers because you will have to store every preceding number at some point in memory, which will eat up RAM at a pretty rapid rate. This is just the number of numbers you will need to store to run the sieve (`n` is the number of numbers): `n(n+1)/2 `. – Blender Nov 25 '11 at 18:29
  • @Blender Why do I need to store preceding numbers? Notice that I'm *not* looking for prime factors, and in fact this code is sort of a complement to the sieve, because I'm removing multples of i where i is *not* a divisor of N. I only care about the largest (prime or composite) divisor, which I save as R, so I throw out each number after checking it. – jmilloy Nov 25 '11 at 18:53
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/5352/discussion-between-jmilloy-and-blender) – jmilloy Nov 25 '11 at 18:55
  • This is a math/algorithms questions explained using code, more than a programming question (for me, at least.) – razlebe Nov 25 '11 at 19:04
  • @razlebe Is there a better stackexchange site for such algorithmic questions? – jmilloy Nov 25 '11 at 19:08
  • Yes. The question could be asked sans code at math. You're struggling with the process, rather than the code you've written to implement it. That for me makes it OT for SO. – razlebe Nov 25 '11 at 20:15
  • @razlebe one potential answer is the second version but using python generators. that belongs here. you aren't required to know any math to do that. – jmilloy Nov 25 '11 at 20:18

3 Answers3

5

Wikipedia has a nice list of factoring algorithms: http://en.wikipedia.org/wiki/Integer_factorization#Factoring_algorithms

Your second approach effectively uses a sieve and has the nice property of quickly reducing the problem when N is a multiple of some small prime. The code can be improved by looping over primes rather than all possible divisors for 2..sqrt(n).

Also, you may want to start out with a primality test so that you know that N is composite before doing additional work.

Your note says that you are not factoring N but the problems are related. The search for F and R amounts to exploring non-overlapping combinations of the prime factors of N.

In the case of N==36, the prime factorization of N is 2, 2, 3, 3. The factors of F and R must include all of those (so that F*R==N) and there can be no overlap (so that GCD(F,R)==1). So 4 and 9 emerge immediately.

A more instructive example may be N==23256. Its factorization is 2,2,2,3,3,17,19. Since there can be no overlap between F and R, each prime base can only go into one of the two buckets (i.e. you either get all the twos or none of them). So, we can group the factors into 8,9,17,19. To find R, we want the combination of those factors that is as large as possible but below 152.49, the square-root of 23256. Our choices are {8}, {9}, {8,9}, {8,17}, {8,19}. The largest of those is 8*19 which is 152. The corresponding F is 17*19 or 153.

The choices listed above are computed as [choice for choice in powerset([8,9,17,19]) if prod(choice) < math.sqrt(N)].

So the whole program pretty much boils down to this:

prime_factors = factorize(N)      # [2,2,2,3,3,17,19]
clusters = [p**e for p, e in collections.Counter(prime_factors).items()]  # [8,9,17,19]
R = max(prod(group) for group in powerset(clusters) if prod(group) < math.sqrt(N))
F = N // R

The powerset search can be made faster by pruning the generation of sets whenever they exceed the square root on N.

Keep in mind that factorizing is computationally expensive and powersets grow very quickly but it is likely far less work than starting that the original algorithm which does many divisions starting at the square root of N and working downwards.

Raymond Hettinger
  • 216,523
  • 63
  • 388
  • 485
  • Let's say N is 36, then the solution I'm looking for is R=4, F=9. If I loop through primes, I don't find R=4. How do we recover this solution if we we restrict to checking only primes? – jmilloy Nov 25 '11 at 18:45
  • "4" should not even be listed in number factorization - what you need to find are the prime factors. In this case, you should find "2" twice, – jsbueno Nov 25 '11 at 18:51
  • @jsbueno No. I'm not factoring N. The only restrictions on F and R are that they are relatively prime. – jmilloy Nov 25 '11 at 18:58
  • 1
    @jmilloy: the point is that the easiest way of solving your problem is to first factor N and then group the factors. – Chris Dodd Nov 25 '11 at 21:12
  • Thanks. This is little bit comical - I'm *writing* a primality test, and the point is that it is supposed to be easier to find N-1 = F*R and factor F than to factor N. I decided to try to minimize F, which turns out (it seems) to be as hard as factoring N-1 in the first place. – jmilloy Nov 25 '11 at 21:41
0

Could you get the prime factorization of N and then find the greatest product combination of all of the prime factors that is less than sqrt(N)?

For example with 36 it would find that the prime factorization is 2*2*3*3. Then you would try all the different combinations of the primes:

2 = 2
3 = 3
2*2 = 4
2*3 = 6
3*3 = 9
2*2*3 = 12
2*3*3 = 18
2*2*3*3 = 36

And you know that those are all factors of 36, so you find the largest one such that it is less than sqrt(36) which turns out to be 4.

However, I don't really see how that could be much faster than just doing your first version unless you've already got an existing list of prime numbers or prime factorizations, or some awesome prime factorization algo, or you're doing all of this with extremely large numbers.

But even then (back to the first version raving) O(sqrt(n)) is a pretty fast runtime, and only requires O(1) memory, so really the first algo might just be the way to go. I don't see how it would be slow, especially in C on a modern computer.

Andrew Rasmussen
  • 14,912
  • 10
  • 45
  • 81
  • Well, yeah having a factorization of N would solve the *whole* problem... but this algorithm is in lieu of getting a factorization of N :P – jmilloy Nov 25 '11 at 20:02
  • Ohhh haha. Well then I would just stick with your first version. It's very efficient in both memory and runtime. Good luck! – Andrew Rasmussen Nov 25 '11 at 20:08
  • There is not need to consider combinations like R=2*3 because then F gets the other primes F=2*3 and those would not be relatively prime. So you either get all the twos or none of them in a given factor. Accordingly, the possible candidate factors are {1, 4, 9, 36}. – Raymond Hettinger Nov 25 '11 at 22:18
  • I'm afraid I don't quite follow your explanation? – Andrew Rasmussen Nov 25 '11 at 22:26
  • The OP wants two factors, F and R which multiply to N, has R as larger as possible (upto the sqrt(N), and are relatively prime to each other. That latter restriction means that 2*3 isn't a candidate for R (because the corresponding F=N/R wouldn't be relatively prime to R). This restriction helps prune the search space considerably. – Raymond Hettinger Nov 26 '11 at 00:44
0
def factor_partial(N):
    R = int(sqrt(float(N)))
    sieve = [1, 1] + [0] * (R-1)
    for i in xrange(2, R) :
        if sieve[i]==0 :
            j=i*i;
            while j<=R :
                sieve[j]=1
                j = j + i
    primes = [i for i in xrange(R+1) if sieve[i]==0]

    saveN = N
    primepower_divisors = []
    for i in primes :
        if N%i == 0 :
            term = 1
            while N%i == 0 :
                term = term * i
                N = N / i
            primepower_divisors = primepower_divisors + [term]
            if N==1 : break

    largecoprime_divisors = [1]
    for i in primepower_divisors :
        for j in largecoprime_divisors :
            largecoprime_divisors = largecoprime_divisors + [i*j]

    F = min([i for i in largecoprime_divisors if i>R])
    return F, saveN/F

I have Used sieve method to calculate list of primes (There are a lot of optimizations possible in calculating list of primes) We could use the fact that .. there will be no prime p such that F%p == 0 and R%p == 0 . since gcd(F, R)=1

anshul410
  • 824
  • 2
  • 11
  • 23
  • Usually min([i for i in largecoprime_divisors if i>R]) fails because it has an empty sequence. Other times it is slower than my first version. – jmilloy Nov 25 '11 at 20:14
  • 1
    Yeah this one is slower .. I just found out .. :) Perhaps You should use Polard's Rho .. – anshul410 Nov 25 '11 at 20:33