9

I looked around and found other questions that had answers but none of them address the scope of this particular question., including this question, and also this one.

I have to compute the LCM of large ranges of numbers in an efficient way. I didn't look too in-depth at those other questions because they don't deal with ranges of numbers that are as large as the ones this algorithm must process.

The code I've got right now can calculate the LCM of every number between 1 and 350000 in about 90 seconds. (The resulting number is some 76000 decimal digits long). I hope to eventually be able to scale it over ranges that are millions or maybe even billions of elements long.

It will probably be paralellized eventually. With some algorithms this won't be hard at all, for others it will be trickier (if, for example, the algorithm uses the currently generated LCM to compute primality for other parts of its computation)

Here it is:

public static BigInteger getLCMOfRange(BigInteger lower, BigInteger upper)
{
    BigInteger M = BigInteger.ONE;
    BigInteger t;
    
    // long l = System.currentTimeMillis();
    // System.out.println("Calculating LCM of numbers up to " + upper + "...");
    for (; lower.compareTo(upper) != 1; lower = lower.add(BigInteger.ONE))
    {
        t = M.gcd(lower);
        if (t.compareTo(lower) == 0)
            continue;
        M = M.multiply(lower).divide(t);
    }
    // System.out.println("Done.  Took " + (System.currentTimeMillis() - l) + " milliseconds.  LCM is " + M.bitCount()+ " bits long.");
    return M;
}

Note that unlike a typical for loop, this function operates over [lower, upper] instead of [lower, upper). This behavior is intentional.

A bit of supporting math is that the LCM of some set of numbers product of the set of prime factors from which any one of the numbers can be produced without requiring any outside of the pool. If my range is [1,20], I can represent that in the following way:

1: 1         6:  3*2      11: 11       16: 2^4
2: 2         7:  7        12: 3*2^2    17: 17
3: 3         8:  2^3      13: 13       18: 3^2*2
4: 2^2       9:  3^2      14: 7*2      19: 19
5: 5         10: 5*2      15: 5*3      20: 5*2^2

LCM{[1,20]}: 2^4*3^2*5*7*11*13*17*19 = 232792560

Are there any more efficient ways to compute an LCM over such a large range?

I don't care if the algorithm someone suggests is very memory-heavy, time performance is much more important (and also more expensive) than memory performance in this case.

This is not a homework question.

Question

What is the most efficient way to calculate the LCM of a very large range of numbers? This algorithm needs to operate on prohibitively wide ranges of numbers and must thus be carefully optimized.

Addendum 1

A closely related question is: What's the most efficient way to calculate the logarithm of one BigInteger (base another BigInteger)? The resulting value can be truncated to the nearest integer.

Community
  • 1
  • 1
Wug
  • 12,956
  • 4
  • 34
  • 54
  • It's notable that I can probably use prime numbers to speed this up somehow. It's also notable that I can use this range of numbers to calculate prime numbers: if gcd(LCM, n) is prime and n is less than the square of the top of the range, n must be prime). This will let me check primality of numbers less than max^2 in approximately linear time (to the size of LCM) – Wug Aug 15 '12 at 21:28
  • For the logarithm sub-question: loga(b) = logx(b)/logx(a), where x could be any constant, so simply dividing the binary (or decimal) length of the two numbers should give you an approximation. Or is that not accurate enough? – biziclop Aug 15 '12 at 22:01
  • @biziclop I need the exact value of floor(a log b base c), where a, b, and c are presumably all bigintegers. Though in practice, I may not run this algorithm with large enough inputs to require that c be a biginteger. – Wug Aug 16 '12 at 14:41

3 Answers3

3

This is the layout of the algorithm. I assume that you always start from 1:

  1. Find prime numbers in the range. You can use Eratosthenes sieve for 350000. For bigger range of number, you'd need segmented sieve.

  2. For each prime number p, use logarithmic function to find the largest exponent e that pe is within the range. Multiply pe to the LCM. (The optimization details is up to your implementation)

Why is it correct?

  • For numbers in the form pe where p is prime, and e >= 1, due to step 2, has been include in the LCM, so pe | LCM.
  • Other numbers will have the form N = p1e1p2e2...pnen (where pi are pairwise distinct primes numbers and ei >= 1), which is larger or equal to piei (for all i from 1 to n). Since piei | LCM, due to previous argument, N | LCM.
Community
  • 1
  • 1
nhahtdh
  • 55,989
  • 15
  • 126
  • 162
  • This is the solution I had in mind but it only works if your range starts from 1. – biziclop Aug 15 '12 at 22:09
  • To make it work for a range of numbers from x,y. You only need to remove the unique primes from 1,x-1. For example, if the range is 14,20. You will only need to remove the primes between 20/2 = 10 and 14.(10,14). So you will remove 11, 13. You don't need to remove 7 because it appears in the prime factorization of 14. – JustinDanielson Aug 15 '12 at 22:20
  • 1
    @JustinDanielson I fail to see how that would make it work. How would you find the biggest exponent of 2 for example, if your range was simply (40,40)? – biziclop Aug 15 '12 at 22:23
  • @biziclop: I thought of a solution for arbitrary range, but it may not be efficient for small range (in which case prime factorization is a better choice). First step is the same (sieve always starts from 1). For the 2nd step: find p^e so that it is the largest number that is <= the upper bound. If p^e is >= the lower bound, then include in LCM. Otherwise find largest p^e' so that p^e + p^e' is >= lower bound, include p^e' in the LCM. – nhahtdh Aug 15 '12 at 22:33
  • This will probably be the answer I use. I can also check primality of a number by taking the GCD of it and my cumulative LCM (as long as it's less than the LCM's current upper bound squared, GCD == 1 indicates prime) – Wug Aug 16 '12 at 14:44
  • @Wug: It would require integer division if you use GCD algorithm with the current LCM, which is much more expensive (as LCM can grow very large) than finding the largest power of each prime number in the range. – nhahtdh Aug 16 '12 at 15:00
2

This is a generalisation of the answer of @nhahtdh

First step, find all the primes that are less than or equal to the upper bound.

Then take each prime p and write down both the lower and upper bound in a base p notation. The highest digit that is different in the two numbers is the exponent of p that you need to include in your LCM. If lower bound is 1, this is trivially the same as the other answer.

Note that the complexity of this algorithm doesn't depend on the length of the range, only the magnitude of the upper bound.

biziclop
  • 48,926
  • 12
  • 77
  • 104
  • The step to find the highest different digit is O(log N) where N is the upper bound. – nhahtdh Aug 15 '12 at 23:08
  • Exactly. And there are approximately N/logN prime numbers less than N, so step two should be O(N). – biziclop Aug 15 '12 at 23:11
  • What you described is not correct for range 63-65 in base 3: they only differ at the least significant bit. – nhahtdh Aug 16 '12 at 13:38
  • @nhahtdh Then we need to refine that rule a bit. The aim is to find the highest number of trailing zeroes (modulo p) in the range. So if lower or upper bound already contains some, you should consider it. – biziclop Aug 16 '12 at 14:46
0

Instead of a prime generator and some explicit means to determine largest powers under the limit n (350000),
Try a modified SoE:

Use "all ones" (~0) or -1 for MULTIPRIME: has more than one distinct prime factor,
0 or 1 for CANDIDATE.
For each natural number up to l = sqrt(n), keep in sieve an integer large enough to hold l, initialised to 0 for prime candidate.
Set sieve at multiples of a prime p to p if zero, else to MULTIPRIME if not equal to p.

The Least Common Multiple is the product of all elements of sieve > CANDIDATE and not MULTIPRIME (> CANDIDATE if CANDIDATE > MULTIPRIME).


(Depending on memory system and CPU, re-assigning values or avoiding it

    sieve[p] = sieve[p] | p == p ? p : MULTIPRIME;
// or
    if (sieve[p] != MULTIPRIME)
        if (sieve[p] == CANDIDATE)
            sieve[p] = p;
        else if (sieve[p] != p)
            sieve[p] = MULTIPRIME;

may be slower.)

greybeard
  • 2,249
  • 8
  • 30
  • 66