8

I was taking a look at some prelude functions to teach a workmate about recursion and I found that some are written in rather a weird way. Example:

{-# NOINLINE [1] zipWith #-}
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith f = go
  where
    go [] _ = []
    go _ [] = []
    go (x:xs) (y:ys) = f x y : go xs ys

Why is it written as a call to go where go is defined right afterwards? IMO a more natural way to define it is:

zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith _ [] _ = []
zipWith _ _ [] = [] 
zipWith f (x:xs) (y:ys) = f x y : zipWith f xs ys

I guess it is due to some inline / loop-fusion / lazy / whatever-feature that allow GHC to optimize better, but this is the point in which Haskell becomes quite obscure to me. Can anyone explain (as easier as possible) the reason for such a function definition, if there is any.

Edits

Many comments, @AndrewRay's answer and in this question, point in the direction of inlining as the reason for such an auxiliar local function. Nevertheless, zipWith is marked with the pragma NOINLINE [1] zipWith, which, up to GHC's user guide: 7.13.5.5. Phase control, means to not inline before first phase and maybe inline afterwards (for whatever this means). In the linked question, the PO refers to foldr which is implemented using the same trick but without any PRAGMA. I'd argue that the author is avoiding inlining but my knowledge in this is small.

Edit 2

added link to source definition of zipWith

lsmor
  • 4,698
  • 17
  • 38
  • 3
    To avoid passing the `f` each time. – Willem Van Onsem Nov 12 '19 at 11:44
  • @WillemVanOnsem That's a quick answer!. Can you elaborate a little bit more? It doesn't seem obvious since `go` is actually passed, and `go` contains `f` in its definition.... isn't it? – lsmor Nov 12 '19 at 11:47
  • "More natural" is in the eye of the beholder. To me `go` is very natural. It zips its two list arguments. Zips with what? With something it knows about. That "something" is irrelevant to the rcursive machinery of zipping. – n. m. could be an AI Nov 12 '19 at 11:57
  • 2
    @n.'pronouns'm. There is nothing subjective to Occam's razor. `go` is a new entity, and there is no concept underlying it — they did not even come up with a real name. So it is an elaboration, and it is, denotationally, extraneous — _unnatural_. – Ignat Insarov Nov 12 '19 at 13:23
  • 1
    You don't even have the most readable version of `go`. Define `go (x:xs) (y:ys)` first, then the other two base cases reduce to the single case `go _ _ = []`. – chepner Nov 12 '19 at 14:07
  • @IgnatInsarov `go` is a closure over `f`; I'd say that's pretty conceptual. – chepner Nov 12 '19 at 14:09
  • @chepner Closure is [such a wide term](https://stackoverflow.com/a/33307944) that your sentence substantiates `go` no more than if you said _"`go` is a function"_. Besides, _"closure"_ is quite an extraneous concept itself — an implementation detail! Then again, I do not think people use this concept of closure in the context of Haskell programming so often. So, if `go` is nothing but a closure, it hardly deserves a name, and your argument fortifies mine: the definition without `go` is indeed objectively more natural. – Ignat Insarov Nov 12 '19 at 14:56
  • @IgnatInsarov Fine, pick whatever precisely defined term you want. – chepner Nov 12 '19 at 15:00
  • @chepner I did not mean to sound abrasive. My point though is exactly that no precise definition for the role of `go` in `zipWith` has come up so far, and moreover that it is not clear whether it needs to be there at all. – Ignat Insarov Nov 12 '19 at 15:15
  • @IgnatInsarov I always say, be careful with sharp tools, you can cut yourself. – n. m. could be an AI Nov 12 '19 at 15:19
  • @n.'pronouns'm. Occam's razor does not cut humans, only ideas. – Ignat Insarov Nov 12 '19 at 15:32
  • A `NOINLINE[1]` pragma doesn't mean "don't inline at all." It means "don't inline until the first optimization round stabilizes." This is important for making sure it has an opportunity to participate in `foldr`/`build` fusion before simple inlining is tried. `foldr` itself not having the inlining delay suggests something about the order in which optimizations are applied, but I'm not totally sure what. – Carl Nov 12 '19 at 16:41

1 Answers1

5

As Willem Van Onsem mentioned in his comment, this means that f doesn't need to be passed in each recursive call.

Your version of the function uses the recursive call zipWith f xs ys. Because f is an argument of zipWith, it has to get passed repeatedly. This means that it isn't obvious from looking at zipWith that f never changes from one level of recursion to the next.

The Prelude version uses the recursive call go xs ys, which immediately signals that every recursive call to go makes use of the same f. Being able to immediately notice invariants like this helps us reason logically about our code.

EDIT: I had previously believed that inlining wasn't relevant here, but Carl and Daniel Wagner both point out that the thunk f only needs to be evaluated once and substituted in the definition of go, rather than being reevaluated for every recursion as would be the case in your definition of zipWith.

You also mention loop-fusion, which is the process of merging adjacent traversal functions into a single traversal. Since this is already a single traversal, that doesn't apply here either.

A. R.
  • 2,031
  • 13
  • 27
  • Is `zipWith` with `go` self-recursive though? – Ignat Insarov Nov 12 '19 at 14:58
  • 1
    It's 100% about performance of compiled code. Making the top-level definition non-recursive allows it to be inlined, which results in `go` being defined someplace where `f` is in scope, which means that if `f` is known it will get inlined into the recursive part. This reduces the use of indirect function calls for potentially significant speedups. – Carl Nov 12 '19 at 15:20
  • You say it's not about performance, but it actually is. If you make a closure as in GHC's Prelude, you copy the pointer to `f`'s thunk once, then look it up in the closure once per list element you produce. If you don't, as in the proposed alternative implementation, you copy the pointer to `f`'s thunk once per list element you produce, plus then still have to look it up in the newly created closure. GHC is smart enough to *sometimes* eliminate this latter cost, so these days this technique is only faster when multiple arguments are read-only, but it wasn't always so. – Daniel Wagner Nov 12 '19 at 15:27
  • @Carl, that's a fair point; I didn't consider `zipWith` itself being inlined. – A. R. Nov 12 '19 at 15:28
  • @DanielWagner, I had thought that that was optimized away in any case, but I'll defer to your expertise on that one. – A. R. Nov 12 '19 at 15:31
  • Hi @AndrewRay. Thanks for answering, but `zipWith` is marked with `NOINLINE [1] zipWith` pragma, which makes me hesitate about your point. Check for the edit if you will – lsmor Nov 12 '19 at 16:04
  • @Ismor, that indicates that `zipWith` should not be inlined itself. The point about inlining `f` in the definition of `go` still stands unless I'm misunderstanding the pragma statement. – A. R. Nov 12 '19 at 16:06
  • @AndrewRay In the question linked in the edit, the accepted answer says that `f` is inlined in the definition of `go` when `foldr` is inlined. I am not trying to argue against you and you're probably right, but this is somehow contradictory information. (@Carl's comment actually is about inlining `zipWith` so `f` can be inlined too) – lsmor Nov 12 '19 at 16:28