0

I want to efficiently calculate ((X+Y)!/(X!Y!))% P (P is like 10^9+7)

This discussion gives some insights on distributing modulo over division. My concern is it's not necessary that a modular inverse always exists for a number. Basically, I am looking for a code implementation of solving the problem.

For multiplication it is very straightforward:

public static int mod_mul(int Z,int X,int Y,int P)
{
// Z=(X+Y) the factorial we need to calculate, P is the prime
long result = 1;
while(Z>1)
  {
    result = (result*Z)%P
    Z--;
  }
return result;
}

I also realize that many factors can get cancelled in the division (before taking modulus), but if the number of divisors increase, then I'm finding it difficult to efficiently come up with an algorithm to divide. ( Looping over List(factors(X)+factors(Y)...) to see which divides current multiplying factor of numerator).

Edit: I don't want to use BigInt solutions.

Is there any java/python based solution or any standard algorithm/library for cancellation of factors( if inverse option is not full-proof) or approaching this type of problem.

Community
  • 1
  • 1
crazydiv
  • 812
  • 9
  • 30
  • 1
    "My concern is it's not necessary that a modular inverse always exists for a number." If P is a prime, then the only integers for which inverses mod P don't exist are multiples of P. It's simple to detect those in advance. – user2357112 Jan 18 '14 at 07:27
  • @user2357112: Better yet -- in the question that the OP links to, the accepted answer actually gives a formula for the modular inverse when the modulus is a prime. – ruakh Jan 18 '14 at 07:36
  • Yes, but that adds an extra layer of check to see if b is prime and then compute b^(p-2)%p. I want to confirm if this could be avoided or not. – crazydiv Jan 18 '14 at 07:43
  • b does not need to be prime, only p. This needs checking only once, and `pow(b, p-2, p)` is likely to be faster than any fancy running cancelation of factors you might try to do. – Paul Hankin Jan 18 '14 at 08:30

3 Answers3

3

((X+Y)!/(X!Y!)) is a low-level way of spelling a binomial coefficient ((X+Y)-choose-X). And while you didn't say so in your question, a comment in your code implies that P is prime. Put those two together, and Lucas's theorem applies directly: http://en.wikipedia.org/wiki/Lucas%27_theorem.

That gives a very simple algorithm based on the base-P representations of X+Y and X. Whether BigInts are required is impossible to guess because you didn't give any bounds on your arguments, beyond that they're ints. Note that your sample mod_mul code may not work at all if, e.g., P is greater than the square root of the maximum int (because result * Z may overflow then).

Tim Peters
  • 67,464
  • 13
  • 126
  • 132
  • 1
    I think Lucas's theorem helps only if m or n are much greater than P. Otherwise their expansion in base P is trivial, and Lucas's theorem degenerates to choose(m, n) == choose(m, n) mod P. The question here gives P=10^9+7, so I expect m and n to be relatively small. – Paul Hankin Jan 18 '14 at 08:10
  • Could be! I couldn't guess what "is like" (as in "P is like 10^9+7") was supposed to mean, though. – Tim Peters Jan 18 '14 at 08:19
  • I was not looking for a specific problem here but all cases in general(understanding the logic involved). Lucas Theorem definitely helps when P is comparably small. Sorry for not being very clear in the question.Thanks :) – crazydiv Jan 18 '14 at 08:51
3

It's binomial coefficients - C(x+y,x).

You can calculate it differently C(n,m)=C(n-1,m)+C(n-1,m-1).

If you are OK with time complexity O(x*y), the code will be much simpler.

http://en.wikipedia.org/wiki/Combination

maxim1000
  • 6,297
  • 1
  • 23
  • 19
3

for what you need here is a way to do it efficiently : -

  1. C(n,k) = C(n-1,k) + C(n-1,k-1)

  2. Use dynamic programming to calculate efficient in bottom up approach

  3. C(n,k)%P = ((C(n-1,k))%P + (C(n-1,k-1))%P)%P

    Therefore F(n,k) = (F(n-1,k)+F(n-1,k-1))%P

Another faster approach : -

  1. C(n,k) = C(n-1,k-1)*n/k

  2. F(n,k) = ((F(n-1,k-1)*n)%P*inv(k)%P)%P

inv(k)%P means modular inverse of k.

Note:- Try to evaluate C(n,n-k) if (n-k<k) because nC(n-k) = nCk

Vikram Bhat
  • 6,106
  • 3
  • 20
  • 19
  • 1
    Great answer Vikram. The second formula you suggested should be crazy fast. For the inverse I assume you intend to use the extended Euclidian algorithm. For those intersted, overall, the complexity is O(k*log k) for computing C(n,k) if I'm not mistaking (the log k is for Euclid). I doubt there is a faster algorithm. – Catalin Pol Jan 18 '14 at 10:38