1

My basic question is why do the results differ for these four implementations of the factorial function and more specifically why do the functions start to differ for n=13?

library(Rcpp)
cppFunction('   int facCpp(int n)
                {
                    if (n==0) return 1;
                    if (n==1) return 1;
                    return n*facCpp(n-1);
                }
            ')



cppFunction('   double fac2Cpp(int n)
                {
                    if (n==0) return 1;
                    if (n==1) return 1;
                    return n*fac2Cpp(n-1);
                }
            ')

cppFunction('   long int fac3Cpp(long int n)
                {
                    if (n==0) return 1;
                    if (n==1) return 1;
                    return n*fac3Cpp(n-1);
                }
            ')

c(factorial(12),prod(1:12),facCpp(12),fac2Cpp(12),fac3Cpp(12))
c(factorial(13),prod(1:13),facCpp(13),fac2Cpp(13),fac3Cpp(13))
c(factorial(20),prod(1:20),facCpp(20),fac2Cpp(20),fac3Cpp(20))
c(factorial(40),prod(1:40),facCpp(40),fac2Cpp(40),fac3Cpp(40))

I realize that the question is perhaps a duplicate since an answers is probably suggested here Rcpp, creating a dataframe with a vector of long long which also shows suggests why the functions start to differ for f(13)

2^31-1>facCpp(12)
#> [1] TRUE
2^31-1>13*facCpp(12)
#> [1] FALSE


c(factorial(12),prod(1:12),facCpp(12),fac2Cpp(12),fac3Cpp(12))
#>[1] 479001600 479001600 479001600 479001600 479001600
c(factorial(13),prod(1:13),facCpp(13),fac2Cpp(13),fac3Cpp(13))
#> [1] 6227020800 6227020800 1932053504 6227020800 1932053504
c(factorial(20),prod(1:20),facCpp(20),fac2Cpp(20),fac3Cpp(20))
#> [1]  2.432902e+18  2.432902e+18 -2.102133e+09  2.432902e+18 -2.102133e+09
Community
  • 1
  • 1
  • 2
    12! overflows neither an `int` nor a `long int`. 13! overflows an `int` but not a `long int`. 40! overflows both. What you're seeing is overflow in your C++ code. – josliber Apr 17 '14 at 19:01
  • @josilber thx for the answer that clears up the C++ related part. But if you run the code you will notice that fac3Cpp(13) also returns a wrong result. So my guess is that this is related to the Rcpp package not supporting `long int` hence fac3Cpp becomes the facCpp. – Jesper Hybel Pedersen Apr 17 '14 at 19:15
  • 2
    @user2055639 fac3Cpp(13) returns the correct result on my machine. The C++ standard allows for this difference though. – Dason Apr 17 '14 at 19:20
  • Similarly to @Dason, `fac3Cpp(13)` gives the correct result on my computer. Please update your post with the output you're seeing to get help with the discrepancy you're seeing. – josliber Apr 17 '14 at 19:34
  • @Dason ad "fac3Cpp(13) returns the correct result on my machine" this puzzles me? "The C++ standard allows for this difference though." not sure I get what you are aiming at here?? I guess in the end I'm looking for a somewhat general working rule for when not to use `int`. So I guess that in general one can't be certain that `long int` would fix the problem. ... I will update my post with output. – Jesper Hybel Pedersen Apr 17 '14 at 19:35

1 Answers1

3

You are essentially doing this wrong. See the R help page for factorial:

‘factorial(x)’ (x! for non-negative integer ‘x’) is defined to be ‘gamma(x+1)’ and ‘lfactorial’ to be ‘lgamma(x+1)’.

You are not supposed to compute it this way. Why? Well look at this:

R> evalCpp("INT_MAX")
[1] 2147483647
R> 

You will hit numerical overflow. Hence the different algorithm as implemented eg in R's factorial() function which just does gamma(x+1). And you can do that in C++ too:

R> cppFunction('double myFac(int x) { return(R::gammafn(x+1.0)); }')
R> myFac(4)
[1] 24
R> myFac(12)
[1] 479001600
R> myFac(13)
[1] 6227020800
R> myFac(20)
[1] 2.4329e+18
R> myFac(40)
[1] 8.15915e+47
R> 
Dirk Eddelbuettel
  • 360,940
  • 56
  • 644
  • 725
  • @Eddelbuettel thx for the answer it is important to see the way you are supposed to do it. I wasn't really aiming at a correct implementation of the factorial as just fooling around to get goint using Rcpp. If you have any thoughts as to why using `long int` in fac3Cpp returns the wrong answer please don't hold back? – Jesper Hybel Pedersen Apr 17 '14 at 20:19
  • Didn't have time to look in detail, and would suspect some OS and compiler-version dependence there as we all know what the correct answer should be. I'd set up algorithms such that you are *not* likely to get bitten by under- or overflow. [ BTW: It is Eddelbuettel with two Ds ] – Dirk Eddelbuettel Apr 17 '14 at 20:22
  • 1
    @user2055639 compare `evalCpp("LONG_MAX")` and `factorial(21)` -- you're still going to overflow past 20. `double` is able to store a much larger range of values. – Kevin Ushey Apr 17 '14 at 23:20