1

Issue

A console app with Biginteger calculation freezes

Details

I'm developing a console application in C# which tests whether very big numbers (10 to the power of several tens to hundreds) are prime. Since default integer types can deal with numbers only up to 10^19 (long, ulong), I'm using BitInteger class. But when I run the app in the debug mode in Visual Studio, the app freezes.

    static void Main(string[] args)
    {
        int exp = 100;
        var bi = BigInteger.Pow(10, exp);
        var sw = new Stopwatch();
        sw.Start();
        for (int i = 0; i < 1000; i++)
        {
            bi++;
            Console.WriteLine($"{i}th try : {bi} ({sw.Elapsed.ToString("mm\\:ss\\.ff")})");

            bool b = IsPrime(bi);
            if (b)
            {
                Console.WriteLine($"{bi} is a prime number");
            }

            //GC.Collect();
        }
        sw.Stop();

        Console.Read();
    }

    static private bool IsPrime(BigInteger n)
    {
        if (n <= 1)
            return false;
        else if (n <= 3)
            return true;
        else if (n % 2 == 0 || n % 3 == 0)
            return false;
        for (BigInteger i = 5; i * i <= n; i += 6)
        {
            if (n % i == 0 || n % (i + 2) == 0)
                return false;
        }

        return true;

    }

Whether the program freezes depends on the variable exp. I tested several values of exp.

  • (the values of exp) : (i at which the program froze)
  • 10 : didn't freeze (the app finished in less than a second
  • 15 : didn't freeze (however the console window refreshes only every 2 seconds)
  • 17 : didn't freeze (however the console window refreshes only every 17 seconds)
  • 20 : 39
  • 25 : 13
  • 30 : 37
  • 50 : 27
  • 100 : 37

It's weird that i doesn't monotonically go up as exp increases. So I ran the program for exp = 100 three times but got the same results. The numbers seem to be reproducible and reliable.

I know there are algorithms to test primality better than this one but I will try them later. First, I'd like to check the entire behavior of the program.

I googled "biginteger console freeze c#" for this issue and found two articles.

  1. MillerRabin primality test in C#
  2. C# BigInteger and int How to save memory?

The first one says "freeze" and "Biginteger" but the answer wasn't much helpful. The second one referred to saving memory so I thought the issue is about garbage collection. Then I added GC.Collect() at the end of the for loop (the commented out line) but this didn't solve the issue. I got the same result.

How should I solve this issue?

Environment

  • Windows 8
  • Visual Studio 2017
  • C#
  • .NET Framework 4.6.1
dixhom
  • 2,419
  • 4
  • 20
  • 36
  • mjwills, the programs is looping inside the for loop in `IsPrime()`. This was checked by putting a break point. – dixhom Nov 25 '17 at 12:17
  • The smallest factor of `100000000000000000000000000000000000000000000000000000000000‌​00000000000000000000‌​00000000000000000003‌​6` is obviously `2`. So `IsPrime()` should return `false` on `else if (n % 2 == 0 || n % 3 == 0)` and a line should be printed. I'm not sure about `000000000000000000000000000000000000000000000000000000000000‌​00000000000000000000‌​00000000000000000037`. I ran the program for about 20 minutes and the console didn't show any new line. Also, it doesn't make sense the program runs normally until `I`=35 and suddenly stops at `i`=36. The change in `bi` is pretty small. – dixhom Nov 25 '17 at 12:24
  • Possible duplicate of [Faster way to check if a number is a prime?](https://stackoverflow.com/questions/17579091/faster-way-to-check-if-a-number-is-a-prime) – mjwills Nov 25 '17 at 12:35
  • I understood `10^100 + 36` is very fast. So are `10^100 + 35`, `10^100 + 34`, `10^100 + 33`, `10^100 + 32`. But what about `10^100 + 31`? Or `10^100 + 27`? What's the reason you can be sure these are "fast since they have factors that are small" while `10^100 + 37` isn't? – dixhom Nov 25 '17 at 12:39
  • Okay, you were right. I also checked the smallest factors of the numbers. For `10^100 + n`, they were 23 for n=33, 7 for n=31 and 37 for n=27. The biggest one for n <37 was 3221 for n=9. For n=37, after 8 minutes of running, i was 4374334583 and still looping. So it was just that `10^100 + 37` happened to have a quite large smallest factor while other n wasn't. But it doesn't explain the issue of "the console window refreshes only every 17 seconds" for `exp`=17. I meant several lines appear on the console window then several more appear in 17 seconds. Not one line by one. Any ideas for this? – dixhom Nov 25 '17 at 13:08
  • 1
    Oh, I got it! So what's you are talking about is that line n pops up in 17 seconds, line n+1 in 17.001s, line n+2 in 17.002s, line n+3 in 17.003s (and the list goes on) so it looked like all the lines came out exactly in 17 seconds, which is not true. Thank you very much! – dixhom Nov 25 '17 at 13:16
  • I didn't know just saying "thank you" is not okay in SO, which is a fairly frequently used phrase to show appreciation and fundamental in human relationships. I guess SO didn't want comments inundated with ones only saying "Thank you." – dixhom Nov 25 '17 at 13:25
  • 1
    Factoring is known to be a hard problem to solve quickly; that's why it is the basis of cryptosystems. You can do much better than you are doing, but not *enormously* better in general. – Eric Lippert Nov 25 '17 at 22:31
  • You haven't said *why* you are checking enormous numbers for primality. What exactly are you trying to do? There might be a better way. For example, statistical primality tests are much faster, but do not guarantee primeness. – Eric Lippert Nov 25 '17 at 22:31

1 Answers1

1

Your algorithm basically says:

are you divisible by 2? what about 3? what about 5? 7? 11? 13? 17? 19?

etc etc

For small input values, checking all those permutations is fast. For large input values, it can be fast if the large input value has a small factor (e.g. 2, 3, 5 or 37).

But if the large input lacks a small factor (either because it is prime, or because its smallest factor is quite large), your algorithm has to do a lot of checking. Basically it has to check one in three (i.e. two out of every six) numbers up to the square root of the input (until it finds a match). For large numbers, this involves a lot of calculations.

If it is taking too long, you need to code a better / faster IsPrime algorithm. This answer may be helpful in that regard.

You could also consider storing a list of some of the known large prime numbers, to enable those numbers to be looked up quickly (e.g. from a database) rather than 'calculated'.

Re:

(however the console window refreshes only every 17 seconds)

This is because some of the inputs are indeed prime - and thus it takes 17 seconds or so to verify that (i.e. check every permutation). It looks to you like the console is refreshing only every 17 seconds, but instead it is calculating for 17 seconds. Then the subsequent ones calculate very quickly (since they aren't prime) - so it looks like they are coming out 'as a batch'.

mjwills
  • 23,389
  • 6
  • 40
  • 63
  • > "You could also consider storing a list of known large prime numbers, to enable those numbers to be looked up quickly (e.g. from a database) rather than 'calculated'." This is not practical. The rough estimate of the number of prime numbers from 1 to N is N/log(N). It's 10^97 for N=10^100. The size of the entire data will be at least 10^97 bytes. I doubt there's a server that can store that amount of data.But I could use a prime number list up to, like, 10000, to use only prime numbers for `i` in `for` loop. – dixhom Nov 25 '17 at 13:39
  • `This is not practical.` To store **all** of them would be impractical, yes. But you could store a _subset_ of them (e.g. some of the **large** ones - there isn't any point storing the small ones since they are quick to calculate). Ultimately it is an optimisation technique - you need to decide whether you want to throw CPU at the problem, or storage at the problem, or both. – mjwills Nov 25 '17 at 21:45
  • @dixhom: You **doubt** there is a server that can store that much data? That is over a billion times more than the number of atoms in the universe. Now would be a good time to remove all possible doubt on the matter. – Eric Lippert Nov 25 '17 at 22:56
  • Haha, thanks @EricLippert (to be fair, your comment is true for **all** prime numbers, I am not 100% sure it is true for all **known** prime numbers). – mjwills Nov 25 '17 at 23:07
  • @EricLippert couldn't agree with you more. And I **doubt** using all possible particles including photons in the universe would be enough :) – dixhom Nov 26 '17 at 00:54
  • @mjwills uhmmm... I don't think using only a subset of large prime numbers would be practical. Let's say we are going to check the numbers from 10^10 to 10^100. But there are about 3.9 x 10^97 prime numbers in that range. So storing a billion(10^9) or trillion(10^12) prime numbers in the range does reduce the calculation cost, but **only by a extremely tiny percentage of it**. It's not so effective. – dixhom Nov 26 '17 at 01:01
  • Fair enough @dixhom – mjwills Nov 26 '17 at 01:48