Placing of modulo operation in code

Question

I have been trying to solve a problem in which the solution comes down to computing the value of

(n+m-2 Combinations m-1).

This is what I have written. In cases where the answer goes beyond the 10^9+7, value I need to print my_answer%(10^9+7).

mod_val=10**9+7
current=[int(x) for x in raw_input().strip().split()]
m=current[0]-1
n=current[1]-1
hi,lo=max(m,n),min(m,n)
num_prod=1
den_prod=1
for each in xrange(1,lo+1):
    den_prod=den_prod*each
    num_prod=num_prod*(hi+each)
print (num_prod//den_prod)%mod_val

But the modulo operation is right at the bottom after all the computations have been completed. Is there a way where I can put the modulo operation somewhere in between the code in order to save computations or improve performance?

score 3 · Accepted Answer · edited Jun 20 '20 at 09:12

Background

Facts:

(m + n) C n = (m + n)! / (n! m!) = (1 / n!) * ((m + n)! / m!)

Looking at the code:

Line 1: den_prod=den_prod*each represents (1 / n!)

Line 2: num_prod=num_prod*(hi+each) represents a reduced form of ((m + n)! / m!).

Solution

The key ideas are to use modular exponentiation in the for loop, then applying the divide modulo operation on the results. The divide operation turns into a multiplication of a modulo and a modular inverse. Finally, to calculate the modulo inverse we use Euler's Theorem.

def mod_inv (a, b):
    return pow(a, b - 2, b)

mod_val=10**9+7
current=[int(x) for x in raw_input().strip().split()]
m=current[0]-1
n=current[1]-1

hi,lo=max(m,n),min(m,n)
num_prod=1
den_prod=1
for each in xrange(1,lo+1):
    den_prod = (den_prod*each) % mod_val
    num_prod = (num_prod*(hi+each)) % mod_val

print (num_prod * mod_inv(den_prod, mod_val)) % mod_val

Performance

I timed 3 different solutions to the problem. Timing 5000 combinations: (5000 C n) where n goes from 0 to 4999.

Code 1: The solution above

def mod_inv (a, b):
    return pow(a, b - 2, b)

mod_val=10**9+7

hi = 5000
for lo in range(0, hi-1):

    # Code 1
    num_prod=1
    den_prod=1
    for each in xrange(1,lo+1):
        den_prod = (den_prod*each) % mod_val
        num_prod = (num_prod*(hi+each)) % mod_val

    output = (num_prod * mod_inv(den_prod, mod_val)) % mod_val
    # print output

Timing 1:

real    0m3.607s
user    0m3.594s
sys     0m0.011s

Code 2: The solution you presented

mod_val=10**9+7

hi = 5000
for lo in range(0, hi-1):

    # Code 2
    test1 = 1
    test2 = 1
    for each in xrange(1,lo+1):
        test1 = (test1*each)
        test2 = (test2*(hi+each))
    
    test_output = (test2 / test1) % mod_val
    # print test_output

Timing 2:

real    0m25.377s
user    0m25.337s
sys     0m0.027s

Code 3: scipy solution

from scipy.misc import comb

hi = 5000
for lo in range(0, hi-1):

    # Code 3
    c = comb(hi+lo, lo, exact=True)
    # print c

Timing 3:

real    0m36.700s
user    0m36.639s
sys     0m0.048s

API for Large Inputs - Lucas Theorem

def mod_inv (a, b):
    return pow(a, b - 2, b)

def small_nCr (n, r, mod):
    hi = max(r, (n - r))
    lo = min(r, (n - r))
    num_prod=1
    den_prod=1
    for each in range (1, lo + 1):
        den_prod = (den_prod * each) % mod
        num_prod = (num_prod * (hi + each)) % mod
    small_c = (num_prod * mod_inv (den_prod, mod)) % mod
    return small_c

def lucas (n, r, mod):
    c = 1
    while (n > 0 or r > 0):
        ni = n % mod
        ri = r % mod
        if (ri > ni):
            return 0
        c = c * small_nCr (ni, ri, mod)
        n = n / mod
        r = r / mod
    return c

def nCr (n, r, mod):
    return lucas (n, r, mod) % mod

Note: If the modulo value is not prime, you can apply the Chinese Remainder Theorem.

Sources:

Modulo properties

Modulo exponentiation

Modular multiplicative inverse function - usage of pow

Lucas Theorem

Your answer works for arguments less than 10^9+7. For arguments of that size or larger, you can reduce to your case using Lucas's theorem: https://en.wikipedia.org/wiki/Lucas%27_theorem . For efficiency, I would also use the extended Euclidean algorithm to calculate inverses, which I believe is considerably faster than Fermat's little theorem. — Edward Doolittle, Aug 07 '15 at 04:14
Timings for Extended Euclidean algorithm: `real 0m3.645s user 0m3.631s sys 0m0.011s`, pretty much the same. Will look into Lucas, thanks for the heads up! The code for the extended euclidian algorithm is here: https://en.wikibooks.org/wiki/Algorithm_Implementation/Mathematics/Extended_Euclidean_algorithm — The Brofessor, Aug 07 '15 at 04:29

Placing of modulo operation in code

1 Answers1

Background

Solution

Performance

API for Large Inputs - Lucas Theorem