Haskell Pattern Matching: Readability and Performance

Question

I'm going through the learn you a haskell tutorial and I've been tripping over some of the examples the author has given.

For example he reimplemented zip as follows:

zip' :: [a] -> [b] -> [(a,b)]  
zip' _ [] = []  
zip' [] _ = []  
zip' (x:xs) (y:ys) = (x,y):zip' xs ys

He uses a similar approach for all his other examples, where he puts the most specific patterns first. Here is a slightly different version of the zip function:

zip' :: [a] -> [b] -> [(a,b)]
zip' (x:xs) (y:ys)  = (x, y):zip' xs ys
zip' _ _            = []

As far as I understand both methods do the same thing. If an empty list is provided either way (x:xs) or (y:ys) won't match which will finish the recursion by appending the empty list [].

I personally prefer the second version for readability, but maybe I'm wrong in doing so.
Does it have any effect on the performance of a method? As far as I understand if the top most pattern does not match, Haskell will check against the next pattern. Does the order of the patterns affect performance?

Kind regards,

Edit:

Possibly duplicate of: Haskell GHC: what is the time complexity of a pattern match with N constructors?

Summary: The order of the patterns is very important for the semantics (in terms of strict evaluation of the arguments) and the readability of a function. The pattern match itself will always be in O(1) time complexity.

Possible duplicate of [Haskell GHC: what is the time complexity of a pattern match with N constructors?](http://stackoverflow.com/questions/9027384/haskell-ghc-what-is-the-time-complexity-of-a-pattern-match-with-n-constructors) — Nima Mousavi, Feb 07 '16 at 14:20

behzad.nouri · Answer 1 · 2016-02-06T19:39:23.550

7

As far as I understand both methods do the same thing.

almost; with an exception:

\> zip' undefined []   -- 1st definition of zip'
[]
\> zip' (filter (< 4) [1..]) [1, 2, 3]
[(1,1),(2,2),(3,3)]

whereas:

\> zip' undefined []   -- 2nd definition of zip'
*** Exception: Prelude.undefined
\> zip' (filter (< 4) [1..]) [1, 2, 3]
[(1,1),(2,2),(3,3)   -- gets stuck here; never returns

in other words, the 2nd definition always forces weak head normal form for both arguments.

Performance-wise, this means that one can construct a pathological example such that WHNF involves heavy computations, therefore one definition performs very differently than the other.

edited Feb 06 '16 at 19:39

answered Feb 06 '16 at 18:53

behzad.nouri

74,723
18
126
124

3

That is true. But `zip' undefined [1, 2, 3]` will produce an exception in both cases. So I can't see how this helps or rather why it would be desirable. – Nima Mousavi Feb 06 '16 at 19:02
1

@Nimi, in this case it only determines *which* argument is unconditionally strict. – dfeuer Feb 07 '16 at 02:19
1

@Nimi the point was that you can make the two functions behave very differently and not do the same thing. note the other example, `zip' (filter (< 4) [1..]) [1, 2, 3]` to see how this can also impact performance. – behzad.nouri Feb 07 '16 at 13:08
@behzad.nouri Ahh.. I can see it now. Thx – Nima Mousavi Feb 07 '16 at 15:24

Haskell Pattern Matching: Readability and Performance

1 Answers1