6

I'm a physicist, and I've been doing work with Fortran a lot recently. Originally i used Java extensively for recreation because it was the first language I learned, but I've abandoned it for Fortran and C++. I have an amateur passion for prime numbers and so I created a prime number sieve. I was able to find all the prime numbers up to 2^31 in 15 seconds. This is the maximum array size for Java so that was the end of that. I ported the code carefully( I mean CAREFULLY, I was so upset my code was slow and that I couldn't find an error that I ported my Fortran code back into Java to verify that it wasn't my fault and then ported it back to Fortran, erasing each iteration!). The problem is that around 800,000,000 Fortran will grind to a halt. Up to that point its beating Java, but after that it's insanely slow. I spent a few hours graphing it and fitting a curve. It slows exponentially and could take hundreds of years to solve up to Javas level. I've asked many people to no avail. Why is this happening to me?!?! Are there any wise Fortran coders who can help me? I'm running a Macbook Pro late 2013 i5. My code is below.

program sieve
integer(4),allocatable:: prime(:)
integer(4)::a,max,b,primeCount
write(*,*)"Welcome to the slow prime number sieve!"
write(*,*)"--------------------------------------------"
write(*,*)"Up to what numbers do you need to find primes for?"
write(*,*)"Enter a number below 2^(32-1)"
read*, max

primeCount=0
allocate(prime(max))
prime(1)=1

    do a=2,int(sqrt(real(max))) !the main loop
        if(prime(a)==0)then !if the number is marked as prime procede
            do b=2*a,max,a  !eliminate all the numbers that are multiples of the number
                if(prime(b)==0)then !but only spend time eliminating if the number is still marked prime
                    prime(b)=1
                end if
            end do
        end if
    end do

do a=1,max
    if(prime(a)==0)then
        primeCount=primeCount+1
    end if
end do

print*, primeCount

end program

EDIT: My machine has 4 Gigs of ram installed

aPhysicist
  • 163
  • 4
  • 3
    Though I don't think it affects the run time of your code I think you have made an error; it looks to me as if you expect the elements of `prime` to be initialised to `0`. You may have used compilation options to do that, but by default Fortran programs don't initialise variables, that's the programmer's responsibility. – High Performance Mark Jul 29 '14 at 06:08
  • 8
    How much RAM do you have on your machine? An array of 800,000,000 `kind=4` integers takes up roughly 3GB. If you exceed your RAM, the array will be cached on disk... – Alexander Vogt Jul 29 '14 at 06:29
  • Are you using your compiler's optimisation? I get a ~25% speedup by using `-O3` with gfortran 4.7.2 – Yossarian Jul 29 '14 at 09:12
  • just work by segments of, say, 100,000 integers each (logical bits are obviously better, but I don't know Fortran), maintaining separately the changing starting value for the offset 0 in the array. cf. http://stackoverflow.com/a/18154879/849891, http://stackoverflow.com/a/20762980/849891, http://stackoverflow.com/a/19641049/849891. – Will Ness Jul 29 '14 at 11:17

3 Answers3

4

I can see several things you could do to speed the code up although none of them seems likely to explain the sharp drop in performance you experience. The most likely culprit seems to be the RAM constraint as Alexander Vogt suggests.

The first thing you should do is to change prime from integer to logical array. This reduces the memory requirements and also speeds up the evaluations of if (prime(a)==0).

The relevant parts of the code would then look as follows

logical(1),allocatable:: prime(:)

primeCount=0
allocate(prime(max))
prime = .false.
prime(1)=.true.

    do a=2,int(sqrt(real(max))) !the main loop
        if(.not. prime(a))then !if the number is marked as prime procede
            do b=2*a,max,a  !eliminate all the numbers that are multiples of the number
                if(.not. prime(b))then !but only spend time eliminating if the number is still marked prime
                    prime(b)=.true.
                end if
            end do
        end if
    end do

do a=1,max
    if(.not. prime(a))then
        primeCount=primeCount+1
    end if
end do

I don't do any Java programming but in Matlab if you declare prime(1:max)=0 and then only switch values bewteen 0 and 1 I would think that Matlab treats the array as a logical array. Java might be doing the same. This could explain why your Java code does not suffer from the performance degradation (assuming the RAM constraint is really the issue).

EDIT: I have done a few experiments.

On my machine (Debian Linux 64bit, i3, 16GB RAM, ifort 14 with default flags) it took 22 seconds for max=800 million (8E8). It took 60 seconds for max=2E9. This is not hours as reported in the question. Also in each case the prime array happened to be initialized to zero.

Moreover using integer(1) makes the programme run 33% faster then using integer(4). With logical(1) it runs less then 5% faster then with integer(1). This behaviour is probably due to the better use of cash as each element of prime occupies less space in memory therefore the processor can do a larger number of iterations with the data currently in cash going through the loops much faster.

The conclusions I would draw from this would be that the culprit was the lack of RAM as noted by Alexander Vogt and that there is a fair chance the author's experience was not affected by the omission to initialize the prime array (although that definitely should not have happened) as HighPerformanceMark pointed out. Further I suspect that Java declared prime as a logical array and that is why the problem did not occur there. (Although 2^31 in 15 seconds in Java? The Fortran code used here comes nowhere close to this. Was really the same code compared?)

PetrH
  • 700
  • 1
  • 4
  • 12
  • How is change prime from integer to logical array supposed to save memory??? They are the same size in Fortran (and completely same in C). – Vladimir F Героям слава Jul 29 '14 at 07:55
  • @VladimirF OK, you are right, I must have been thinking Matlab here. It should be `logical(1)` then but the point stands I guess. – PetrH Jul 29 '14 at 08:09
  • `logical(1)` is not guarented to exist, but if it exists and happens to be a 1 byte logical it will save some memory. The same holds for `integer(1)`. I would probably use `character`, one is quite sure it is 1 byte and it is portable. – Vladimir F Героям слава Jul 29 '14 at 08:11
  • Out of curiosity how would `character` compare in terms of performance? Ie how costly would `if(.not. prime(a)=='T')` be? – PetrH Jul 29 '14 at 08:24
  • it's the same as 1 byte integer. – Vladimir F Героям слава Jul 29 '14 at 08:31
  • Edited my answer. Added results of a few experiments. – PetrH Jul 29 '14 at 11:42
  • "I suspect that Java declared `prime` as a logical array and that is why the problem did not occur there". Java doesn't have a mechanism to detect such things automatically, but will use whatever data type the author declared it as. If OP used an `int` array, that's 4 bytes per element, but they could have used a `byte` array or `boolean` array, which is only 1 byte per element, or a `BitSet`, which is 8 elements per byte. A Java `BitSet` makes for a very memory-efficient prime sieve! – Boann Jul 29 '14 at 13:48
  • Thank you for your answer!. I suspected it was RAM problems of some sort, but was still rather confused as to why it was so much slower. I'll try using byte and see how it compares. – aPhysicist Jul 29 '14 at 14:26
3

This is more of an extended version of my earlier comment than an answer. I think that your mistake in not setting all elements of prime to 0 is affecting the run time of the code, though perversely my initial assessment is that it is making it faster.

Your original code fails to set to 0 all the elements of primes. It is highly likely that when these statements are first encountered:

do a=2,int(sqrt(real(max))) !the main loop
    if(prime(a)==0)then !if the number is marked as prime procede

the value of prime(2) is not 0 so your algorithm fails to mark 2 and its multiples as prime. And it gets worse ! Unless the elements of prime are initialised to 0 then prime(a) cannot ever be guaranteed to be 0 and your program could run to completion without ever marking a number as prime. I'd expect this mistake to make your code faster since it will not enter the inner loop other than by chance.

Faster perhaps but so broken it's not worth measuring its performance.

High Performance Mark
  • 77,191
  • 7
  • 105
  • 161
  • Using `gfortran 4.7.2` on CentOS seems to initialise `prime` to `0`. Using `-finit-integer=1` doesn't change this, which seems to be a bit strange. – Yossarian Jul 29 '14 at 09:11
  • initialization can depend of build flavour (debug/release) – Peter Petrik Jul 29 '14 at 09:39
  • 1
    @Yossarian Read the manual: *"These options do not initialize allocatable arrays"* https://gcc.gnu.org/onlinedocs/gfortran/Code-Gen-Options.html Regarding the initial values in an allocatable array, anything can happen. If the run-time system uses just `malloc` no initialization happens. – Vladimir F Героям слава Jul 29 '14 at 09:51
  • Thank yo for your response. I purposely neglected initializing because the array was always set to zero on my machine and always gives the correct answer. I did find that if I changed my integer(4) to integer(1) it gave a tremendous performance boost, but still not as fas as Java! it takes my machine wither integer(1) 45 seconds to find all primes up to 2,000,000,000 which is 3 times longer than java. Any further ideas? – aPhysicist Jul 29 '14 at 14:22
  • You got lucky with the initialisation to `0`, that's not guaranteed by the language standard and is likely not to be portable. It also makes me suspicious that you may not have cranked the optimisation level high enough when you compiled the code. Initialisation of large arrays is the kind of thing compilers don't bother with at the highest optimisation levels but might at lower optimisation levels. – High Performance Mark Jul 29 '14 at 15:03
1

You can store only odd numbers in prime(a) array. That would reduce both size of prime array to half and loop size. Also ensure you are using optimal optimization for your compiler.

Peter Petrik
  • 9,701
  • 5
  • 41
  • 65