11

I'm using gcc 4.6.1 and am getting some interesting behavior involving calling a constexpr function. This program runs just fine and straight away prints out 12200160415121876738.

#include <iostream>

extern const unsigned long joe;

constexpr unsigned long fib(unsigned long int x)
{
   return (x <= 1) ? 1 : (fib(x - 1) + fib(x - 2));
}

const unsigned long joe = fib(92);

int main()
{
   ::std::cout << "Here I am!\n";
   ::std::cout << joe << '\n';
   return 0;
}

This program takes forever to run and I've never had the patience to wait for it to print out a value:

#include <iostream>

constexpr unsigned long fib(unsigned long int x)
{
   return (x <= 1) ? 1 : (fib(x - 1) + fib(x - 2));
}

int main()
{
   ::std::cout << "Here I am!\n";
   ::std::cout << fib(92) << '\n';
   return 0;
}

Why is there such a huge difference? Am I doing something wrong in the second program?

Edit: I'm compiling this with g++ -std=c++0x -O3 on a 64-bit platform.

Omnifarious

4 Answers

14

joe forms an integral constant expression; it must be usable, for example, as an array bound. For that reason, a reasonable compiler will evaluate it at compile time.

In your second program, even though the compiler could calculate it at compile time, there's no reason why it must.
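
As a concrete sketch of what "usable as an array bound" means (not from the original answer, and scaled down to fib(20), since an array of fib(92) bytes would be absurdly large):

#include <iostream>

constexpr unsigned long fib(unsigned long x)
{
   return (x <= 1) ? 1 : (fib(x - 1) + fib(x - 2));
}

const unsigned long joe = fib(20);

// joe qualifies as a constant expression, so it can size an array;
// the compiler must therefore know its value at compile time.
char buffer[joe];

int main()
{
   ::std::cout << sizeof(buffer) << '\n';  // prints 10946
   return 0;
}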

MSalters
  • Maybe the first program takes longer to compile? :) – Torp Aug 15 '11 at 13:06
  • Would be interesting to see if the compiler *does* compute it at compile time if you enable optimisations. – Kerrek SB Aug 15 '11 at 13:08
  • It seems a bit remiss for it not to, a missed optimization opportunity. But you're right of course. :-/ And @Kerrek, I'm compiling it with optimizations on. – Omnifarious Aug 15 '11 at 13:08
  • @Torp: Perhaps not noticeably, if the compiler is in effect memoizing the function when it computes it at compile time. That's about all I can think of for why running the second program takes longer than Omnifarious is willing to wait, but compiling the first one presumably is much faster. – Steve Jessop Aug 15 '11 at 13:11
  • @Torp - Compiling the first program does not take a noticeably different amount of time than compiling the second. I believe that Steve Jessop's surmise is likely correct. – Omnifarious Aug 15 '11 at 13:15
  • Then I guess it means that even though `constexpr` helps, when you want to be sure that something *will* be computed at compile time, you need to force it. – Matthieu M. Aug 15 '11 at 14:28
  • Yours is the best answer for its succinctness and clarity. – Omnifarious Aug 15 '11 at 20:51
  • Thanks, but I didn't even give an answer! – user2023370 Mar 20 '12 at 22:31
4

My best guess would be that in program number one, fib(92) is evaluated at compile time, with lots of tables and such for the compiler to keep track of which values have already been evaluated, making running the program almost trivial,

whereas the second version is actually evaluated at run time, without lookup tables of already-evaluated constant expressions, meaning that evaluating fib(92) makes something like 2**92 recursive calls.

In other words, the compiler does not exploit the fact that fib(92) is a constant expression.
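
To make the lookup-table idea concrete, here is a hedged sketch of run-time memoization (not the compiler's actual mechanism; the cache size of 93 assumes arguments no larger than 92):

#include <iostream>

// Each fib(n) is computed once and cached, so the number of recursive
// calls grows linearly with x instead of exponentially.
unsigned long fib_memo(unsigned long x)
{
   static unsigned long cache[93] = {};  // 0 means "not yet computed"
   if (x <= 1)
      return 1;
   if (cache[x] == 0)
      cache[x] = fib_memo(x - 1) + fib_memo(x - 2);
   return cache[x];
}

int main()
{
   ::std::cout << fib_memo(92) << '\n';  // finishes instantly, even at run time
   return 0;
}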

Martin Kristiansen
  • Actually I think it makes fib(92) recursive calls, so not as much as 2**92 but still too long to be in any way usable. – visitor Aug 15 '11 at 13:15
  • @visitor - Looking at the assembly output (which for the second program is a surprisingly large mass of code) it looks like it will be many fewer, because the cases for `fib(1)` through `fib(6)` or so are handled by special cases. – Omnifarious Aug 15 '11 at 13:18
  • @visitor: you almost fooled me into thinking I was wrong :-) If you look at the recursion tree for the execution you will see that there are fib(92) leaves, meaning that there must be something on the order of fib(92) nodes; if the tree were balanced the approximation would be about right, give or take a factor. So not 2**92, but not fib(92) either. :-) – Martin Kristiansen Aug 15 '11 at 13:19
  • @Martin: *grin* (S)he missed all the internal nodes of the call tree. :-) – Omnifarious Aug 15 '11 at 13:22
2

There's wiggle room for the compiler to decide not to evaluate at compile time if it thinks something is "too complicated". That's in cases where it's not being absolutely forced to do the evaluation in order to generate a correct program that can actually be run (as @MSalters points out).

I thought perhaps the decision about compile-time laziness would come down to the recursion depth limit. (The standard suggests 512 as the minimum, and you can raise it with the command-line flag -fconstexpr-depth if you want.) But rather, that controls when the compiler gives up in any case... even when a compile-time constant is necessary to build the program. So it has no effect on your case.

It seems that if you want an in-code guarantee that the computation happens at compile time, you've found a technique for that. But if constexpr-depth can't help, I'm not sure there are any other relevant compiler flags...
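
For the record, a sketch of that kind of in-code guarantee (not from the original answer): put the call where the language demands a constant expression, so the program won't compile at all unless the value is computed at compile time:

#include <iostream>

constexpr unsigned long fib(unsigned long x)
{
   return (x <= 1) ? 1 : (fib(x - 1) + fib(x - 2));
}

// Both declarations require fib(92) to be evaluated during compilation.
constexpr unsigned long joe = fib(92);
static_assert(joe == 12200160415121876738UL, "known value of fib(92)");

int main()
{
   ::std::cout << joe << '\n';
   return 0;
}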

  • In this case, a depth of 91 is sufficient. And the compiler doesn't actually take very long to compile the first one. Not noticeably longer than the second one at any rate. `-fconstexpr-depth` is used for configuring the maximum depth before the compiler gives up, not the depth that's used when deciding whether or not to optimize a call out of existence. – Omnifarious Aug 15 '11 at 13:27
  • Ah, I wasn't paying enough attention. Still it's probably helpful to have a mention of `constexpr-depth` somewhere in this thread...edited to correct. – HostileFork says dont trust SE Aug 15 '11 at 13:34
  • Yours is the second-best answer because while the best one is quite succinct and clear, it's missing details and recommendations. – Omnifarious Aug 15 '11 at 20:51
  • Thanks! For almost all my answers I try to do that. It's one thing to solve the problem at the precise level of detail the specific questioner needed to get over a hurdle. It's another thing to think about the general usefulness of the answer when someone finds it later via Google with different knowledge (and/or a not-exactly-the-same problem). After all, there's only one initial questioner and who-knows-how-many page viewers in the future! – HostileFork says dont trust SE Aug 15 '11 at 22:29
2

I also wanted to see how gcc optimizes code with this new constexpr keyword, and it's actually just because you are passing fib(92) as an argument to ostream::operator<<

::std::cout << fib(92) << '\n';

that it isn't evaluated at compile time. If you call it somewhere other than as an argument to another function (like you did in)

const unsigned long joe = fib(92);

it is evaluated at compile time. I wrote a blog post about this if you want more info; I don't know whether this should be mentioned to the gcc developers.
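
A sketch of the workaround this suggests (not from the blog post): move the call out of the argument position and into the initializer of a constexpr variable, then stream the named constant:

#include <iostream>

constexpr unsigned long fib(unsigned long x)
{
   return (x <= 1) ? 1 : (fib(x - 1) + fib(x - 2));
}

int main()
{
   // The initializer of a constexpr variable must be a constant
   // expression, so fib(92) is computed by the compiler, not at run time.
   constexpr unsigned long result = fib(92);
   ::std::cout << result << '\n';
   return 0;
}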

b3nj1
  • I did mention it, and it presents an interesting and tricky optimization problem. There are two very different code paths taken by those cases since one HAS to be evaluated at compile time and the other doesn't. – Omnifarious Mar 09 '12 at 17:17