I initially wrote this (brute force and inefficient) method of calculating primes with the intent of making sure that there was no difference in speed between using "if-then-else" versus guards in Haskell (and there is no difference!). But then I decided to write a C program to compare and I got the following (Haskell slower by just over 25%) :
(Note I got the ideas of using rem instead of mod and also the O3 option in the compiler invocation from the following post : On improving Haskell's performance compared to C in fibonacci micro-benchmark)
Haskell : Forum.hs
divisibleRec :: Int -> Int -> Bool
divisibleRec i j
| j == 1 = False
| i `rem` j == 0 = True
| otherwise = divisibleRec i (j-1)
divisible::Int -> Bool
divisible i = divisibleRec i (i-1)
r = [ x | x <- [2..200000], divisible x == False]
main :: IO()
main = print(length(r))
C : main.cpp
#include <stdio.h>
bool divisibleRec(int i, int j){
if(j==1){ return false; }
else if(i%j==0){ return true; }
else{ return divisibleRec(i,j-1); }
}
bool divisible(int i){ return divisibleRec(i, i-1); }
int main(void){
int i, count =0;
for(i=2; i<200000; ++i){
if(divisible(i)==false){
count = count+1;
}
}
printf("number of primes = %d\n",count);
return 0;
}
The results I got were as follows :
Compilation times
time (ghc -O3 -o runProg Forum.hs)
real 0m0.355s
user 0m0.252s
sys 0m0.040s
time (gcc -O3 -o runProg main.cpp)
real 0m0.070s
user 0m0.036s
sys 0m0.008s
and the following running times :
Running times on Ubuntu 32 bit
Haskell
17984
real 0m54.498s
user 0m51.363s
sys 0m0.140s
C++
number of primes = 17984
real 0m41.739s
user 0m39.642s
sys 0m0.080s
I was quite impressed with the running times of Haskell. However my question is this : can I do anything to speed up the haskell program without :
- Changing the underlying algorithm (it is clear that massive speedups can be gained by changing the algorithm; but I just want to understand what I can do on the language/compiler side to improve performance)
- Invoking the llvm compiler (because I dont have this installed)
[EDIT : Memory usage]
After a comment by Alan I noticed that the C program uses a constant amount of memory where as the Haskell program slowly grows in memory size. At first I thought this had something to do with recursion, but gspr explains below why this is happening and provides a solution. Will Ness provides an alternative solution which (like gspr's solution) also ensures that the memory remains static.
[EDIT : Summary of bigger runs]
max number tested : 200,000:
(54.498s/41.739s) = Haskell 30.5% slower
max number tested : 400,000:
3m31.372s/2m45.076s = 211.37s/165s = Haskell 28.1% slower
max number tested : 800,000:
14m3.266s/11m6.024s = 843.27s/666.02s = Haskell 26.6% slower
[EDIT : Code for Alan]
This was the code that I had written earlier which does not have recursion and which I had tested on 200,000 :
#include <stdio.h>
bool divisibleRec(int i, int j){
while(j>0){
if(j==1){ return false; }
else if(i%j==0){ return true; }
else{ j -= 1;}
}
}
bool divisible(int i){ return divisibleRec(i, i-1); }
int main(void){
int i, count =0;
for(i=2; i<8000000; ++i){
if(divisible(i)==false){
count = count+1;
}
}
printf("number of primes = %d\n",count);
return 0;
}
The results for the C code with and without recursion are as follows (for 800,000) :
With recursion : 11m6.024s
Without recursion : 11m5.328s
Note that the executable seems to take up 60kb (as seen in System monitor) irrespective of the maximum number, and therefore I suspect that the compiler is detecting this recursion.