i wonder why methods in the "tail call" pattern perform measurably better in Java then such which use a classic recursion pattern.
Given the following example:
public static long facultyApproachPureRec(long n)
{
if(n == 0 || n == 1)
{
return 1;
}
long temp = facultyApproachPureRec(n-1);
return (n*temp);
}
public static long facultyApproachTailEnd(long n, long m)
{
if(n == 0 || n == 1)
{
return m;
}
long temp = n*m;
return facultyApproachTailEnd(n-1, temp);
}
the facultyApproachTailEnd is always faster then the pure recursion solution. This makes me wonder why because Java doesnt support tail call optimization (which i proved: i can get a StachOverflowError easily). Because of this fact I would assume the tail call solution to perform worse then the pure recursive soltion because we always need to pass the double amount of variables.
Can somebody explain me why the Benchmarks indicate that the tail call soltion is always faster than the oter one?
Note: In c# it behaves like i would expect it to behave: the tail call solution is always slower than the pure recursive solution.
Benchmark Code:
public static void main(String[] args)
{
for(int k = 0; k < 10; k++) {
long begin = System.nanoTime();
for (int i = 0; i < 20000; i++) {
if (facultyApproachPureRec(i) == 0)
{
i++;
}
}
long end = System.nanoTime();
long elapsed = TimeUnit.MILLISECONDS.convert((end - begin), TimeUnit.NANOSECONDS);
System.out.println("Pure recursion took " + elapsed);
begin = System.nanoTime();
for (int i = 0; i < 20000; i++) {
if(facultyApproachTailEnd(i, 1) == 0)
{
i++;
}
}
end = System.nanoTime();
elapsed = TimeUnit.MILLISECONDS.convert((end - begin), TimeUnit.NANOSECONDS);
System.out.println("Tail end took " + elapsed);
}
}
Some results (validated on 3 machines, all are more or less the same):
Pure recursion took 623
Tail end took 461
Pure recursion took 570
Tail end took 418
Pure recursion took 913
Tail end took 581
Pure recursion took 580
Tail end took 417
Pure recursion took 607
Tail end took 411
Pure recursion took 601
Tail end took 417
Pure recursion took 769
Tail end took 372
Pure recursion took 527
Tail end took 568
Pure recursion took 542
Tail end took 419
Pure recursion took 632
Tail end took 333
Thanks for all answers.
Edit: I tested this a lot more often (1000 iterations, directly via cmd (so no IDE)) before writing this question. If my test is wrong im sorry. I didnt use Java very much outside the university.
Edit2: I appended the generated ByteCode below. I am not familiar with Java Byte Code (i am more used to clr), but from a short view they look pretty equal except that the tail call solution has some more code inside and uses/maintains more variables.
public static facultyApproachPureRec(J)J
L0
LINENUMBER 39 L0
LLOAD 0
LCONST_0
LCMP
IFEQ L1
LLOAD 0
LCONST_1
LCMP
IFNE L2
L1
LINENUMBER 41 L1
FRAME SAME
LCONST_1
LRETURN
L2
LINENUMBER 44 L2
FRAME SAME
LLOAD 0
LCONST_1
LSUB
INVOKESTATIC Logik/Rekursion/MainLogik.facultyApproachPureRec (J)J
LSTORE 2
L3
LINENUMBER 45 L3
LLOAD 0
LLOAD 2
LMUL
LRETURN
L4
LOCALVARIABLE n J L0 L4 0
LOCALVARIABLE temp J L3 L4 2
MAXSTACK = 4
MAXLOCALS = 4
// access flags 0x9
public static facultyApproachTailEnd(JJ)J
L0
LINENUMBER 50 L0
LLOAD 0
LCONST_0
LCMP
IFEQ L1
LLOAD 0
LCONST_1
LCMP
IFNE L2
L1
LINENUMBER 52 L1
FRAME SAME
LLOAD 2
LRETURN
L2
LINENUMBER 55 L2
FRAME SAME
LLOAD 0
LLOAD 2
LMUL
LSTORE 4
L3
LINENUMBER 56 L3
LLOAD 0
LCONST_1
LSUB
LLOAD 4
INVOKESTATIC Logik/Rekursion/MainLogik.facultyApproachTailEnd (JJ)J
LRETURN
L4
LOCALVARIABLE n J L0 L4 0
LOCALVARIABLE m J L0 L4 2
LOCALVARIABLE temp J L3 L4 4
MAXSTACK = 4
MAXLOCALS = 6
}