I wanted to check whether g++ performs tail call optimization, so I wrote this simple test program: http://ideone.com/hnXHv
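#include <cstddef>
#include <iostream>
#include <string>
using namespace std;

// Address of a local in the first frame; later frames report their distance from it.
size_t st;

void PrintStackTop(const std::string &type)
{
    int stack_top;  // only its address is used
    if(st == 0) st = (size_t) &stack_top;
    cout << "In " << type << " call version, the stack top is: " << (st - (size_t) &stack_top) << endl;
}

// Tail-recursive: the recursive call is the last thing the function does.
int TailCallFactorial(int n, int a = 1)
{
    PrintStackTop("tail");
    if(n < 2)
        return a;
    return TailCallFactorial(n - 1, n * a);
}

// Not tail-recursive: the multiplication happens after the recursive call returns.
int NormalCallFactorial(int n)
{
    PrintStackTop("normal");
    if(n < 2)
        return 1;
    return NormalCallFactorial(n - 1) * n;
}

int main(int argc, char *argv[])
{
    st = 0;  // reset the baseline before each run
    cout << TailCallFactorial(5) << endl;
    st = 0;
    cout << NormalCallFactorial(5) << endl;
    return 0;
}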
When I compile it without optimization, g++ doesn't seem to treat the two versions any differently:
> g++ main.cpp -o TailCall
> ./TailCall
In tail call version, the stack top is: 0
In tail call version, the stack top is: 48
In tail call version, the stack top is: 96
In tail call version, the stack top is: 144
In tail call version, the stack top is: 192
120
In normal call version, the stack top is: 0
In normal call version, the stack top is: 48
In normal call version, the stack top is: 96
In normal call version, the stack top is: 144
In normal call version, the stack top is: 192
120
The stack grows by 48 bytes per call in both versions, even though the tail call version takes one more int argument. (Why?)
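One possible explanation, assuming an ABI that keeps the stack 16-byte aligned (as the x86-64 System V ABI does): frame sizes get rounded up to a multiple of 16, so one extra 4-byte int need not change the total. Every per-call difference in the output here is a multiple of 16, which fits that assumption. The helper below is only an arithmetic sketch; `RoundUp` is a name made up for illustration and the raw byte counts are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: rounds a byte count up to the next multiple of
// `alignment`. The alignment must be a non-zero power of two.
inline std::size_t RoundUp(std::size_t bytes, std::size_t alignment = 16)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (bytes + alignment - 1) & ~(alignment - 1);
}
```

With a made-up raw frame of 36 bytes, `RoundUp(36)` and `RoundUp(36 + sizeof(int))` both give 48 when `int` is 4 bytes, so the extra argument disappears into the padding.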
So I thought turning on optimization might help:
> g++ -O2 main.cpp -o TailCall
> ./TailCall
In tail call version, the stack top is: 0
In tail call version, the stack top is: 80
In tail call version, the stack top is: 160
In tail call version, the stack top is: 240
In tail call version, the stack top is: 320
120
In normal call version, the stack top is: 0
In normal call version, the stack top is: 64
In normal call version, the stack top is: 128
In normal call version, the stack top is: 192
In normal call version, the stack top is: 256
120
The frame size increased in both cases. The compiler might think my CPU is slower than my memory (which it's not anyway), but I don't see why 80 bytes are necessary for such a simple function. (Why are they?)
The tail call version also takes more space than the normal version, which is completely logical if an int is 16 bytes. (No, I don't own a 128-bit CPU.)
Wondering what reason the compiler could have not to eliminate the tail call, I thought it might be exceptions, because they depend tightly on the stack. So I tried without exceptions:
> g++ -O2 -fno-exceptions main.cpp -o TailCall
> ./TailCall
In tail call version, the stack top is: 0
In tail call version, the stack top is: 64
In tail call version, the stack top is: 128
In tail call version, the stack top is: 192
In tail call version, the stack top is: 256
120
In normal call version, the stack top is: 0
In normal call version, the stack top is: 48
In normal call version, the stack top is: 96
In normal call version, the stack top is: 144
In normal call version, the stack top is: 192
120
That cut the normal version back to its non-optimized frame size, while the tail call version is still 16 bytes larger (64 vs. 48). Still, an int is not 16 bytes.
I thought there might be something in C++ I had missed that needs the stack arranged this way, so I tried C: http://ideone.com/tJPpc
Still no tail call optimization, but the frames are much smaller (32 bytes per frame in both versions).
Then I tried with optimization:
> gcc -O2 main.c -o TailCall
> ./TailCall
In tail call version, the stack top is: 0
In tail call version, the stack top is: 0
In tail call version, the stack top is: 0
In tail call version, the stack top is: 0
In tail call version, the stack top is: 0
120
In normal call version, the stack top is: 0
In normal call version, the stack top is: 0
In normal call version, the stack top is: 0
In normal call version, the stack top is: 0
In normal call version, the stack top is: 0
120
Not only did it tail call optimize the first version, it also optimized the second!
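For reference, a stack top that stays at 0 means the recursion has effectively become a loop. A hand-written equivalent of `TailCallFactorial` would look roughly like the sketch below; `LoopFactorial` and `CheckLoopFactorial` are names made up here, and this is not necessarily what gcc generates.

```cpp
#include <cassert>

// Hypothetical loop form of TailCallFactorial(n, a): each "recursive call"
// becomes an in-place update of the two parameters, so no new frame is needed.
int LoopFactorial(int n, int a = 1)
{
    while (n >= 2) {
        a = n * a;   // corresponds to passing n * a
        n = n - 1;   // corresponds to passing n - 1
    }
    return a;        // corresponds to the n < 2 base case
}

// Small usage check for the sketch above.
void CheckLoopFactorial()
{
    assert(LoopFactorial(5) == 120);
}
```

Writing the loop by hand is also the one portable way to guarantee constant stack use regardless of compiler or flags.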
Why doesn't g++ do tail call optimization when it's clearly available on the platform? Is there any way to force it?
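One difference between the two programs that may be worth isolating: the C version has to pass a plain string literal, while the C++ version builds a temporary `std::string` for every call to `PrintStackTop`. The variant below is a minimal sketch for comparing against the output above. The names `PrintStackTopC`, `TailCallFactorialC` and `base` are made up for it, and whether g++ then eliminates the tail call still depends on the compiler version and flags.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// Baseline address, recorded on the first call (hypothetical name).
static std::uintptr_t base = 0;

// Variant of PrintStackTop that takes a C string, so the caller does not
// have to construct a std::string temporary in its own frame.
void PrintStackTopC(const char *type)
{
    assert(type != nullptr);
    int stack_top;  // only its address is used
    std::uintptr_t here = reinterpret_cast<std::uintptr_t>(&stack_top);
    if (base == 0) base = here;
    std::cout << "In " << type << " call version, the stack top is: "
              << (base - here) << std::endl;
}

// Same recursion as TailCallFactorial, calling the C-string variant.
int TailCallFactorialC(int n, int a = 1)
{
    PrintStackTopC("tail");
    if (n < 2)
        return a;
    return TailCallFactorialC(n - 1, n * a);
}
```

Compiling this with the same `g++ -O2` command and comparing the printed offsets with the runs above would show whether the string temporary is what makes the difference.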