-4

I compiled the following short C++ code below with gcc compiler on Windows

#include <iostream>

#define MAXN 1000000009

using namespace std;

bool prime[MAXN] = {true};

int main()
{
    for (int p = 2; p * p <= MAXN; ++p)
        if (prime[p])
        {
            cout << p << "\n";

            for (int i = p * p; i <= MAXN; i += p)
                prime[i] = false;
        }

    return 0;
}

The code above prints prime numbers less than 1e9 with Sieve of Eratosthenes and doesn't have any errors, but I noticed that the compilation takes even longer time than compilation of other of my C++ programs, it took about few minutes to finish with no error messages. To verify, I got the same result when I compiled again. Even weirder, after the compilation, the code did not print anything and returned an exit code of 0. I did not understand what's happening. What caused this strange behaviour?

Klaus Gütter
  • 11,151
  • 6
  • 31
  • 36
Jimmy Y.
  • 91
  • 7
  • 1
    Your program has undefined behaviour. There is no element with the index of `MAXN` in the `prime` array. – n. m. could be an AI Jan 01 '19 at 07:08
  • 2
    `bool prime[MAXN] = {true};` will not set all of the elements to true. Also probably the source of the slow compiles. That's a lot of values to have to set. Check the size of the output file. It could be massive. – user4581301 Jan 01 '19 at 07:09
  • Well, your program flat out causes clang to crash, even with the UB removed (http://coliru.stacked-crooked.com/a/c36771879c940c32). That `1000000009` seems to be the culprit, I'm not sure how though. – StoryTeller - Unslander Monica Jan 01 '19 at 07:10
  • Read [How to debug small programs](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/). BTW don't confuse "compilation" with "complication". At last, the *compilation* of your program is quick, it is its *execution* which can take a lot of time. – Basile Starynkevitch Jan 01 '19 at 07:11
  • Just checked. GCC 7.2 produces an output file of about a 1 GB in size. Almost certainly why it take so long to compile. – user4581301 Jan 01 '19 at 07:12
  • 1
    Notice that your `MAXN` is probably much too big. Read more about [prime numbers](https://en.wikipedia.org/wiki/Prime_number) and [primality test](https://en.wikipedia.org/wiki/Primality_test). To find all primes less than 1e10 you only need an array of size 1e5 – Basile Starynkevitch Jan 01 '19 at 07:15
  • 2
    @StoryTeller Clang doesn't crash on my machine. Perhaps coliru limits memory. – n. m. could be an AI Jan 01 '19 at 07:16
  • On my Linux desktop the compilation of the above program (with GCC 8, using `g++ p.cc -o p.bin`) takes 8 seconds. That is *certainly not forever*. Of course, the program is crappy. – Basile Starynkevitch Jan 01 '19 at 07:20
  • 3
    Consider not initializing `prime`. It's a global so it will be automatically set to false. Then invert the program logic so the program treats false as a prime. Comment the smurf out of this bit of insanity to explain why you're doing this. It should speed up the build time enormously. – user4581301 Jan 01 '19 at 07:23
  • Upgrading the compiler is a better option, to be sure. – user4581301 Jan 01 '19 at 08:00
  • @user4581301: no need to upgrade the compiler, as long as it is C++11 compliantr – Basile Starynkevitch Jan 01 '19 at 09:35

1 Answers1

9

First, it is not a complication (your program is not stellar, but simple) but a compilation.

On my Linux desktop (with an i5-4690s and GCC 8 under Debian/Sid) the compilation with g++ p.cc -o p.bin of your less than optimal code takes about 8.2 seconds (certainly not forever).

Why does it take that much? Because you have an initialized but huge global variable (your prime array of booleans). So the compiler has to generate a huge binary executable (of about a gigabyte, most of it being the .data segment).

If you don't initialize that global statically, but only inside your main function, by having

bool prime[MAXN];
int main() {
  prime[2] = true;

the generated executable becomes tiny -about 17Kbytes- (because now prime sits in the .bss segment) and the compilation time is about 0.45 seconds.

If you wanted to initialize all the elements in prime you should use a for loop (or perhaps some std::memset, when initializing a range of elements to the same value).

You should think more before coding. Read about prime numbers and primality tests. You don't need such a big prime array because you are looping about √ MAXN times (which is the maximal index you need), i.e. 31623 times. So you could have declared bool prime[32000]; only (and change your program to only use elements in that range).

In practice, when you need a lot of memory, you'll better use heap memory.

If coding in C++, take advantage of its standard containers.

Of course, read How to debug small programs.

BTW, your program is very inefficient. It takes about 10 seconds to run (as p.bin > /tmp/p.out with redirection to /tmp/p.out which gets 3401 lines). For comparison, the BSD primes program (in /usr/games/primes on my Debian), when run as primes 2 31608 > /tmp/pp.out, takes less than 2 milliseconds to produce exactly the same output file, so runs nearly 5000 times faster than yours.

As commented by n.m. your original program has undefined behavior (you should be scared), because the primes[MAXN] element does not exist (you have a buffer overflow) but you are wrongly using it. So replace the inner <= MAXN with < MAXN or declare bool prime[MAXN+1];

Basile Starynkevitch
  • 223,805
  • 18
  • 296
  • 547