How do I define the sieve function for prime computation using higher–order functions?

Question

I have a recursive definition of sieve in Haskell for prime number computation. But I don’t know how to write the same function using higher–order functions such as map or filter. Can anybody show me please?

sieve [] = []
sieve (x:xs) = check (x:xs)

check [] = []
check (x:xs)
        |x/=2 && x/=3 && x/=5 && x/=7 = comp (x:xs)
        |otherwise    = x : sieve xs

comp [] = []
comp (x:xs)
        |x `mod` 2 == 0 = sieve xs
        |x `mod` 3 == 0 = sieve xs
        |x `mod` 5 == 0 = sieve xs
        |x `mod` 7 == 0 = sieve xs
        |otherwise      = x : sieve xs

Relevant read: http://www.haskell.org/haskellwiki/Prime_numbers#Sieve_of_Eratosthenes — reto, Dec 05 '13 at 07:16

Will Ness · Answer 1 · 2021-12-04T18:24:42.027

1

With map and filter and iterate; very slow:

primes = map head $ iterate (\(x:xs) -> filter ((> 0).(`rem`x)) xs) [2..]

with addition of concat; much faster and with much improved complexity:

primes = concat . map fst $ 
   iterate (\(_, (p:ps, xs)) -> case span (< p*p) xs of
                   { (h,t) -> (h, (ps, filter ((> 0).(`rem`p)) t)) }) 
           ([2], (primes, [3..]))

more at Haskell wiki.

You can express iterate through map if you prefer:

iterate f x = let { r = x : map f r } in r

and filter too:

filter f xs = concat $ map (\x -> [x | f x]) xs

But for the true sieve of Eratosthenes, - one which does not detect composites by divisibility testing but generates them directly from primes it finds, and finds the primes in gaps between thus generated composites, - we need some more auxiliary functions, like minus and union, and treelike-folding foldi (foldr can be used in place of foldi but with decreased speed and worsened complexity):

primes = 2 : _Y ((3:) . minus [5,7..]     
                         . foldi (\(x:xs) ys -> x : union xs ys) [] 
                            . map (\p -> [p*p, p*p+2*p..]))
_Y g = g (_Y g)

This runs yet faster, at close to best empirical orders of growth achievable with Haskell's immutable code. Immutable arrays can be faster, but they are excluded here because a. it's not in the question, and b. their performance is determined by a given Haskell implementation, not a user code. Mutable arrays are of course the fastest but the code is even more involved.

edited Dec 04 '21 at 18:24

answered Dec 06 '13 at 09:45

Will Ness

70,110
9
98
181

It's still much slower than the mutable version though. – haskelline Dec 07 '13 at 17:46
@haskelline which one is it? -- have you compared [empirical orders of growth](http://en.wikipedia.org/wiki/Analysis_of_algorithms#Empirical_orders_of_growth)? -- anyway the question was about how to write it with higher order functions, not about how to write the fastest prime code. We have arithmoi package for that, and it won't fit in an SO answer. – Will Ness Dec 07 '13 at 18:41
On my machine the plain simple C version is 4.5x faster for getting the largest prime `< 10000000`. – haskelline Dec 07 '13 at 19:37
@haskelline great! :) -- wait, faster than *what*? -- and again, Was this a question about comparing C and Haskell? :) Have you measured the empirical orders of growth of your C program BTW? – Will Ness Dec 07 '13 at 19:39
Empirical orders of growth are totally irrelevant here. I'm talking about the last version in your answer. You were the one who said this is the *true* sieve of eratosthenes algorithm. My argument is, no, there's a huge constant factor between this and the plain simple sieve of eratosthenes. – haskelline Dec 07 '13 at 20:16
@haskelline it's *never* irrelevant: "constant factor", means *same* algorithmic nature of the two codes, means "true" sieve - assuming your sieve is "true" sieve (most likely). "trueness" is not measured by raw speed, but by asymptotic behaviour. cf. http://ideone.com/fapob - it shows C++ code 16x faster than this kind of code (Ojq01 entry), with very close orders of growth --> good enough for me. :) – Will Ness Dec 07 '13 at 22:36
@WillNess haskelline is correct. As another example, the lovely Haskell quicksort has the same asymptotic complexity as the imperative in-place quicksort. However, it's not considered the "true" quicksort because the true quicksort is supposed to be much more efficient. – is7s Dec 07 '13 at 22:39
@is7s I disagree. This code has best achievable empirical complexity for immutable code in Haskell - same as O'Neill's. Both clock at n^1.2, vs. n^1.1 for C++ baseline. Extra 0.1 obviously how extra log n factor shows itself, totally expectable for the immutable code. – Will Ness Dec 07 '13 at 22:46
@is7s the lovely Haskell quicksort is deforested treesort, which so happens to have same average case complexity as the true quick sort. Doesn't disprove anything; nature well understood. And, it's not "true" not because of its speed, but because Tony Hoare described his algorithm as in-place partition. -- Here, "true" sieve means it is not trial division, the composites are not detected, but generated. – Will Ness Dec 07 '13 at 22:49
@WillNess Well, any non in-place quicksort which pivots the first element of the list is a deforested treesort and it has the same *worst-case* big-O complexity as the in-place counterpart. This doesn't change anything. That said, the algorithm you presented above is not efficient enough to be used in production. However, the simplest form of the imperative sieve of eratosthenes is efficient enough. – is7s Dec 07 '13 at 22:55
correction: The 2nd comment above should read "best achievable immutable" ***list-based*** code. Arrays are faster. – Will Ness Dec 07 '13 at 22:55
@is7s Did I ever claim that it was (fast enough for production)? The question was about using `map`; well, I did. -- I don't think the simplest imperative sieve in Haskell is fast enough. – Will Ness Dec 07 '13 at 22:56
@WillNess saying that the above presented code is the *true* sieve implicitly implies this :) – is7s Dec 07 '13 at 22:58
@is7s no Sir, it doesn't. In my opinion. :) I explained my reasoning already. :) – Will Ness Dec 07 '13 at 22:58
@WillNess The simplest imperative sieve in Haskell is very efficient. I remember benchmarking it against C few years back. The results were really impressive. – is7s Dec 07 '13 at 23:00
@is7s http://ideone.com/fapob. 18 mln primes, C++: 2.89s; simple Haskell STUArray code: 13.65s. Daniel's arithmoi - a very sophisticated code - is much faster I believe. – Will Ness Dec 07 '13 at 23:01
@WillNess I really don't trust that implementation. There has to be serious issues in the code. The haskell code is not shown btw. – is7s Dec 07 '13 at 23:03
@is7s what implementation? you mean STUArrays?... A standard Haskell code. you can find each under its entry: KwZNc for arrays. I see it's cleared out; Ideone glitches. :((( But, it's just the code from haskellwiki page. – Will Ness Dec 07 '13 at 23:04
@WillNess It's not there :) As I said, I already benchmarked against C before and the difference was few milliseconds. Why bother with some code which I suspect has serious problems (from the results shown)? – is7s Dec 07 '13 at 23:08
@is7s it's on http://www.haskell.org/haskellwiki/Prime_numbers#Using_ST_Array -- it depends, what C did you write (with all due respect...:) ). Here in C++ vector is automatically bit-packed. Did you use bitarrays? Was it efficiently coded?.. all kind of factors. C is not automatically fast (pardon the trivialisms). – Will Ness Dec 07 '13 at 23:09
@is7s here it is: http://web.archive.org/web/20120507080459/http://ideone.com/KwZNc – Will Ness Dec 07 '13 at 23:14
@WillNess I was comparing the exact same code written in both languages. If both are not exactly direct-translations of each other, or one implementation has optimizations (such as bit-arrays!) which the other implementation does not have, then the comparison is quite useless! – is7s Dec 07 '13 at 23:17
@is7s but I was told - by Daniel Fischer - that `UArray Int Bool` is also bit-packed! -- I've restored that entry on Ideone BTW... Both codes have exactly same addressing calculations, to work on odds. -- I've seen some code where Daniel replaced the forM_ loops with explicit recursion and unsafe reads and writes; it ran 2x faster; so it'll be 8x ratio instead of 16x. – Will Ness Dec 07 '13 at 23:18
@is7s well, maybe if all the calculations are moved into the ST monad, it will run even faster... will need to check that later. – Will Ness Dec 07 '13 at 23:26
@WillNess still I think the comparison isn't fair. C++ gives you fine-grained control over the memory allocation, while in Haskell you depend on whoever implemented the `array` library (which is most probably well-done, however might not be as flexible as manual memory allocation). That said, the C++ compiler also does really good job optimizing that low-level bit-stuff, which GHC is weak at. Comparing the non bit-packed implementation is fair. – is7s Dec 07 '13 at 23:28
@WillNess also `-fllvm` might help a lot here. Building the `array` library with `-fllvm` should also make a big difference. – is7s Dec 07 '13 at 23:30
@is7s I'll try just returning 5 top primes instead of whole array from the ST code, this should mae it really faster; but I don't have time right now... Also, it's not my pet peeve and - because - I know arithmoi is faster... :) Even if we bring it to be on par with C++, it still won't disprove me - IMHO - on the list-based code. :) – Will Ness Dec 07 '13 at 23:35
@is7s BTW 16x was referring to the list version; STUArray-based is **4.7x slower** than its C++ counterpart. Switching to hand-written loops w/ unsafe access should give it 2x-4x speedup. But hand-written loops w/ unsafe access are even farther away from "the simplest imperative sieve"'s *most natural `forM_`-based loops* as used there. That code *is* "the simplest imperative sieve"* you talked about, you dismiss it without basis AFAI can tell. -- About memory: *exact* same thing is used in both variants: one contiguous array of odds, bit-packed. So this addresses all your remarks I guess. :) – Will Ness Dec 08 '13 at 12:59

jamshidh · Answer 2 · 2013-12-05T08:31:52.603

0

I threw this together quickly, the speed isn't that great, but it is really easy to implement.

primes'::[Int]->[Int]
primes' [] = []
primes' (x:xs) = x:primes (filter ((/= 0) . (`mod` x)) xs)

main = print $ primes [2..20] -- always input a contiguous list from 2 to N.

edited Dec 05 '13 at 08:31

answered Dec 05 '13 at 07:44

jamshidh

12,002
17
31

3

That's not the sieve of Eratosthenes, that's trial division. See http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf – Carl Dec 05 '13 at 07:45
@Carl- Oops, I just read the paper and you are right.... I will fix the wording. – jamshidh Dec 05 '13 at 08:31

How do I define the sieve function for prime computation using higher–order functions?

2 Answers2

Linked