86

The method BigInteger.isProbablePrime() is quite strange; from the documentation, this will tell whether a number is prime with a probability of 1 - 1 / 2^arg, where arg is the integer argument.

It has been present in the JDK for quite a long time, so it means it must have uses. My limited knowledge in computer science and algorithms (and maths) tells me that it does not really make sense to know whether a number is "probably" a prime but not exactly a prime.

So, what is a possible scenario where one would want to use this method? Cryptography?

fge
  • 119,121
  • 33
  • 254
  • 329
  • 4
    http://en.wikipedia.org/wiki/Probable_prime – Kon Dec 11 '14 at 18:41
  • 6
    Also, [Miller-Rabin primality test](http://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test). The main advantage is _speed_. E.g. when you want to check for factors you can do such a test to speed up the factoring process. You can keep the "probably" part of it quite low, and it's useful in practice. But I agree that it's a bit shaky and weird, like floats. – keyser Dec 11 '14 at 18:41
  • @Kon that does not tell anything about when you want to use them – fge Dec 11 '14 at 18:42
  • Probably a trade-off between accuracy and execution time. – maxx777 Dec 11 '14 at 18:42
  • 2
    @maxx777 that's a given -- I ask for an actual use case – fge Dec 11 '14 at 18:42
  • @fge Well, if that's a given then factoring is a big use case. And, as mentioned: security. Primes are a big thing there. – keyser Dec 11 '14 at 18:44
  • 1
    @fge http://stackoverflow.com/questions/23144236/use-of-biginteger-isprobableprime-to-generate-cryptographically-secure-primes – SSC Dec 11 '14 at 18:44
  • This uses Rabin-Miller algorithm and the function javadocs state "The Rabin-Miller algorithm can be found on pp. 259-261 of "Applied 1292: * Cryptography, Second Edition" by Bruce Schneier." – SSC Dec 11 '14 at 18:46
  • Also few bugs are reported: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4624738 – SSC Dec 11 '14 at 18:49
  • 4
    I'd really like the downvoters to explain the reasons behind the downvotes, please – fge Dec 11 '14 at 18:54
  • @keyser I have no doubt about it; I simply wonder why one would be satisfied with a "probable prime". – fge Dec 11 '14 at 18:54
  • 1
    @fge: Every prime of the size used in RSA is a probable prime, even those are "proven" prime. If this seems like a contradiction, think about the probability of a random bit error in computer calculation. These bit errors can occur in the process of running an algorithm that purports to output provably prime numbers. Now, what might be the probability of such an error? Might it be around 2**(-k)? Then running a probabilistic algorithm that produces a prime with around that certainty is essentially just as good. – President James K. Polk Dec 11 '14 at 22:44
  • 17
    "It has been present in the JDK for quite a long time, so it means it must have uses." - or it was added for a useless reason, then not removed because nothing is ever removed. – user253751 Dec 12 '14 at 00:07

6 Answers6

69

Yes, this method can be used in cryptography. RSA encryption involves the finding of huge prime numbers, sometimes on the order of 1024 bits (about 300 digits). The security of RSA depends on the fact that factoring a number that consists of 2 of these prime numbers multiplied together is extremely difficult and time consuming. But for it to work, they must be prime.

It turns out that proving these numbers prime is difficult too. But the Miller-Rabin primality test, one of the primality tests uses by isProbablePrime, either detects that a number is composite or gives no conclusion. Running this test n times allows you to conclude that there is a 1 in 2n odds that this number is really composite. Running it 100 times yields the acceptable risk of 1 in 2100 that this number is composite.

rgettman
  • 176,041
  • 30
  • 275
  • 357
  • It's Rabin-Miller, no? – SSC Dec 11 '14 at 18:48
  • 3
    @Mr.777 I've seen Rabin-Miller once or twice but Miller-Rabin tens of times. I'm not sure if there is an official name though. – keyser Dec 11 '14 at 18:49
  • 3
    @Mr.777 The Wikipedia page I linked above states "Miller-Rabin" first, but acknowledges both names: "The Miller–Rabin primality test or Rabin–Miller primality test". – rgettman Dec 11 '14 at 18:51
  • 5
    The implementation of `isProbablyPrime` is (as far as I can tell) fully deterministic. How would running the test `n` times improve the odds of a correct result? (Even if it were an element of randomness, one would need to have the randomness of multiple calls be independent to affect the risk in the way you describe.) – Ted Hopp Dec 11 '14 at 18:53
  • 11
    @TedHopp The implementation uses a random generator, and each round with a new random numbers gives a 3/4 chance of detecting a composite. The default generator is SecureRandom, with strong randomness guarantees. – that other guy Dec 11 '14 at 19:02
  • @TedHopp The method, using the Miller-Rabin primality test, uses random numbers as tests to determine if the number is composite (a "composite witness" IIRC, so it's technically not deterministic. Each time an iteration doesn't find a composite witness, the odds that it's prime go up. I can't claim to understand fully why, but each iteration not finding a composite witness cuts the odds that it's composite by a factor of a half. – rgettman Dec 11 '14 at 19:03
  • 4
    It may be difficult, however remember that PRIMES is in P. The AKS test may be slower than Miller-Rabin but there isn't an exponential difference, or a polynomial between them. You can use Miller-Rabin to find a bunch of probable primes and use AKS to definitely proove that they are primes. – Bakuriu Dec 12 '14 at 10:43
  • @Bakuriu And I'm sure if they finally invent a time machine to go back in time to around 1996 they'd change that ;) (and quite a few other things too please). That paper was only published in 2002. – Voo Dec 12 '14 at 18:48
  • Actually, running it `N` times yields an error probability of less than `4^-N` *not* `2^-N`. In practice, the error probability is much, much lower still. – Brett Hale Dec 13 '14 at 11:42
  • @Voo, no, they wouldn't change that. As of 2014, AKS is far too slow to be useful for any practical purpose. Bakuriu, I think you should look at actual run times. The difference between 5 milliseconds and 500 *years* is of great practical importance. – DanaJ Dec 15 '14 at 20:19
21

If the test tells you an integer is not prime, you can certainly believe that 100%.

It is only the other side of the question, if the test tells you an integer is "a probable prime", that you may entertain doubt. Repeating the test with varying "bases" allows the probability of falsely succeeding at "imitating" a prime (being a strong pseudo-prime with respect to multiple bases) to be made as small as desired.

The usefulness of the test lies in its speed and simplicity. One would not necessarily be satisfied with the status of "probable prime" as a final answer, but one would definitely avoid wasting time on almost all composite numbers by using this routine before bringing in the big guns of primality testing.

The comparison to the difficulty of factoring integers is something of a red herring. It is known that the primality of an integer can be determined in polynomial time, and indeed there is a proof that an extension of Miller-Rabin test to sufficiently many bases is definitive (in detecting primes, as opposed to probable primes), but this assumes the Generalized Riemann Hypothesis, so it is not quite so certain as the (more expensive) AKS primality test.

hardmath
  • 8,753
  • 2
  • 37
  • 65
  • 4
    It's worth noting that AKS was only discovered in August 2002, whilst this method has been in JDK since February 2002 – James_pic Dec 12 '14 at 13:52
  • 3
    No, wait, this has been in the JDK since February 1997 (I was looking at the `probablePrime` method, not the `isProbablePrime` method) – James_pic Dec 12 '14 at 14:00
  • 1
    Indeed, the title of Agrawal, Kayal and Saxena's 2002 paper "PRIMES is in P" signals the first *unconditional* proof of polynomial (in bit length of *n*) complexity for deterministic (general integer) primality testing. Miller (1975) had shown that, [assuming GRH](http://dl.acm.org/citation.cfm?doid=800116.803773), the primality of an integer can be tested deterministically in steps proportional to the fourth power of the bit length, a much better exponent than is currently known for AKS or its variants. – hardmath Dec 12 '14 at 14:06
  • While AKS is asymptotically faster, methods like ECPP would be much more efficient for 'cryptographic' or 'industrial' primes. – Brett Hale Dec 13 '14 at 11:54
  • 2
    AKS is insanely slow, and will not be faster than APR-CL for any number computable in geological scale time, much less human scale. APR-CL and ECPP were already around in 1997. As Brett mentions, ECPP is a good choice if we want a proof. All of these are slow compared to probable prime methods (e.g. M-R, BPSW, Frobenius). – DanaJ Dec 15 '14 at 20:15
20

The standard use case for BigInteger.isProbablePrime(int) is in cryptography. Specifically, certain cryptographic algorithms, such as RSA, require randomly chosen large primes. Importantly, however, these algorithms don't really require these numbers to be guaranteed to be prime — they just need to be prime with a very high probability.

How high is very high? Well, in a crypto application, one would typically call .isProbablePrime() with an argument somewhere between 128 and 256. Thus, the probability of a non-prime number passing such a test is less than one in 2128 or 2256.

Let's put that in perspective: if you had 10 billion computers, each generating 10 billion probable prime numbers per second (which would mean less than one clock cycle per number on any modern CPU), and the primality of those numbers was tested with .isProbablePrime(128), you would, on average, expect one non-prime number to slip in once in every 100 billion years.

That is, that would be the case, if those 10 billion computers could somehow all run for hundreds of billions of years without experiencing any hardware failures. In practice, though, it's a lot more likely for a random cosmic ray to strike your computer at just the right time and place to flip the return value of .isProbablePrime(128) from false to true, without causing any other detectable effects, than it is for a non-prime number to actually pass the probabilistic primality test at that certainty level.

Of course, the same risk of random cosmic rays and other hardware faults also applies to deterministic primality tests like AKS. Thus, in practice, even these tests have a (very small) baseline false positive rate due to random hardware failures (not to mention all other possible sources of errors, such as implementation bugs).

Since it's easy to push the intrinsic false positive rate of the Miller–Rabin primality test used by .isProbablePrime() far below this baseline rate, simply by repeating the test sufficiently many times, and since, even repeated so many times, the Miller–Rabin test is still much faster in practice than the best known deterministic primality tests like AKS, it remains the standard primality test for cryptographic applications.

(Besides, even if you happened to accidentally select a strong pseudoprime as one of the factors of your RSA modulus, it would not generally lead to a catastrophic failure. Typically, such pseudoprimes would be products of two (or rarely more) primes of approximately half the length, which means that you'd end up with a multi-prime RSA key. As long as none of the factors were too small (and if they were, the primality test should've caught them), the RSA algorithm will still work just fine, and the key, although somewhat weaker against certain types of attacks than normal RSA keys of the same length, should still be reasonably secure if you didn't needlessly skimp on the key length.)

Community
  • 1
  • 1
Ilmari Karonen
  • 49,047
  • 9
  • 93
  • 153
  • The fault issue is one reason why AKS isn't actually used (the astonishingly slow speed is the other), and ECPP is more common. As you note, implementation errors in the algorithms are quite possible, so having a certificate verified with independent code is helpful. – DanaJ Dec 15 '14 at 20:06
8

A possible use case is in testing primality of a given number (at test which in itself has many uses). The isProbablePrime algorithm will run much faster than an exact algorithm, so if the number fails isProbablePrime, then one need not go to the expense of running the more expensive algorithm.

Ted Hopp
  • 232,168
  • 48
  • 399
  • 521
  • So, is that for practicality purposes then? And due to the fact that prime factorization is an NP problem? – fge Dec 11 '14 at 18:57
  • @fge - Yes, the use case I proposed is for practicality. I don't know that this helps with prime factorization, which is a significantly more difficult problem than testing primality. For the latter, there is a polynomial-time algorithm: the [AKS primality test](http://en.wikipedia.org/wiki/AKS_primality_test). – Ted Hopp Dec 11 '14 at 19:03
  • 5
    @fge: Factorization is indeed in NP, but I suspect you meant "NP-complete", which factorization is _not_ known to be. On the contrary it is strongly suspected _not_ to be NP-hard. – hmakholm left over Monica Dec 11 '14 at 20:08
6

Finding probable primes is an important problem in cryptography. It turns out that a reasonable strategy for finding a probable k-bit prime is to repeatedly select a random k-bit number, and test it for probable primality using a method like isProbablePrime().

For further discussion, see section 4.4.1 of the Handbook of Applied Cryptography.

Also see On generation of probable primes by incremental search by Brandt and Damgård.

NPE
  • 486,780
  • 108
  • 951
  • 1,012
5

Algorithms such as RSA key generation rely on being able to determine whether a number is prime or not.

However, at the time that the isProbablePrime method was added to the JDK (February 1997), there was no proven way to deterministically decide whether a number was prime in a reasonable amount of time. The best known approach at that time was the Miller-Rabin algorithm - a probabilistic algorithm that would sometimes give false positives (i.e, would report non-primes as primes), but could be tuned to reduce the likelihood of false positives, at the expense of modest increases in runtime.

Since then, algorithms have been discovered that can deterministically decide whether a number is prime reasonably quickly, such as the AKS algorithm that was discovered in August 2002. However, it should be noted that these algorithms are still not as fast as Miller-Rabin.

Perhaps a better question is why no isPrime method has been added to the JDK since 2002.

James_pic
  • 3,240
  • 19
  • 24
  • Thanks for the historical perspective! Looks like @immibis was on the right track with his comment about "in the JDK but never removed", then? :) – fge Dec 12 '14 at 14:31
  • 1
    I know Java famously never removes stuff from the standard library, but I'm not sure they'd remove it even if they could. For some applications, being 99.999999999% sure something's prime is good enough, and is much faster than being 100% sure. – James_pic Dec 12 '14 at 15:09