13

Consider the two following variations:

myReadListTailRecursive :: IO [String]
myReadListTailRecursive = go []
    where 
    go :: [String] -> IO [String]   
    go l = do {
                 inp <- getLine;
                 if (inp == "") then 
                     return l;
                 else go (inp:l);
                }

myReadListOrdinary :: IO [String]
myReadListOrdinary = do
        inp <- getLine
        if inp == "" then
            return []
        else
            do
                moreInps <- myReadListOrdinary
                return (inp:moreInps)

In ordinary programming languages, one would know that the tail recursive variant is a better choice.

However, going through this answer, it is apparent that haskell's implementation of recursion is not similar to that of using the recursion stack repeatedly.

But because in this case the program in question involves actions, and a strict monad, I am not sure if the same reasoning applies. In fact, I think in the IO case, the tail recursive form is indeed better. I am not sure how to correctly reason about this.


EDIT: David Young pointed out that the outermost call here is to (>>=). Even in that case, does one of these styles have an advantage over the other?

  • That is actually not tail recursive. If you desugar the first one, you get `go l = getLine >>= (\inp -> if (inp == "") then return l else go (inp:l))`. The outermost call is to `(>>=)`. – David Young Nov 11 '17 at 04:40
  • @DavidYoung right, I totally forgot about that. In such a case, is there any advantage of one style over the other? – Agnishom Chattopadhyay Nov 11 '17 at 05:15
  • 2
    I'd consider the first one to be at least morally tail recursive. By comparison `f x = bool (x==0) (f (x-1)) 0` is not tail recursive, but `bool` will eventually transfer control to `f (x-1)`, so this runs in constant space. I guess it's the same with IO's `>>=`, except it's less trivial to notice since we need to dig inside the IO implementation. (Also, note that the result of the two posted IO actions is different, since they return the list in a different order) – chi Nov 11 '17 at 08:09
  • 2
    Related: [Under what circumstances are monadic computations tail-recursive?](https://stackoverflow.com/q/13379060/1333025) – Petr Dec 29 '17 at 17:22

2 Answers2

3

FWIW, I'd go for existing monadic combinators and focus on readability/consiseness. Using unfoldM :: Monad m => m (Maybe a) -> m [a]:

import Control.Monad (liftM, mfilter)
import Control.Monad.Loops (unfoldM)

myReadListTailRecursive :: IO [String]
myReadListTailRecursive = unfoldM go
  where
    go :: IO (Maybe String)
    go = do
        line <- getLine
        return $ case line of
            "" -> Nothing
            s -> Just s

Or using MonadPlus instance of Maybe, with mfilter :: MonadPlus m => (a -> Bool) -> m a -> m a:

myReadListTailRecursive :: IO [String]
myReadListTailRecursive = unfoldM (liftM (mfilter (/= "") . Just) getLine)

Another, more versatile option, might be to use LoopT.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
Petr
  • 62,528
  • 13
  • 153
  • 317
2

That’s really not how I would write it, but it’s clear enough what you’re doing. (By the way, if you want to be able to efficiently insert arbitrary output from any function in the chain, without using monads, you might try a Data.ByteString.Builder.)

Your first implementation is very similar to a left fold, and your second very similar to a right fold or map. (You might try actually writing them as such!) The second one has several advantages for I/O. One of the most important, for handling input and output, is that it can be interactive.

You’ll notice that the first builds the entire list from the outside in: in order to determine what the first element of the list is, the program needs to compute the entire structure to get to the innermost thunk, which is return l. The program generates the entire data structure first, then starts to process it. That’s useful when you’re reducing a list, because tail-recursive functions and strict left folds are efficient.

With the second, the outermost thunk contains the head and tail of the list, so you can grab the tail, then call the thunk to generate the second list. This can work with infinite lists, and it can produce and return partial results.

Here’s a contrived example: a program that reads in one integer per line and prints the sums so far.

main :: IO ()
main = interact( display . compute 0 . parse . lines )
  where parse :: [String] -> [Int]
        parse [] = []
        parse (x:xs) = (read x):(parse xs)

        compute :: Int -> [Int] -> [Int]
        compute _ [] = []
        compute accum (x:xs) = let accum' = accum + x
                               in accum':(compute accum' xs)

        display = unlines . map show

If you run this interactively, you’ll get something like:

$ 1
1
$ 2
3
$ 3
6
$ 4
10

But you could also write compute tail-recursively, with an accumulating parameter:

main :: IO ()
main = interact( display . compute [] . parse . lines )
  where parse :: [String] -> [Int]
        parse = map read

        compute :: [Int] -> [Int] -> [Int]
        compute xs [] = reverse xs
        compute [] (y:ys) = compute [y] ys
        compute (x:xs) (y:ys) = compute (x+y:x:xs) ys

        display = unlines . map show

This is an artificial example, but strict left folds are a common pattern. If, however, you write either compute or parse with an accumulating parameter, this is what you get when you try to run interactively, and hit EOF (control-D on Unix, control-Z on Windows) after the number 4:

$ 1
$ 2
$ 3
$ 4
1
3
6
10

This left-folded version needs to compute the entire data structure before it can read any of it. That can’t ever work on an infinite list (When would you reach the base case? How would you even reverse an infinite list if you did?) and an application that can’t respond to user input until it quits is a deal-breaker.

On the other hand, the tail-recursive version can be strict in its accumulating parameter, and will run more efficiently, especially when it’s not being consumed immediately. It doesn’t need to keep any thunks or context around other than its parameters, and it can even re-use the same stack frame. A strict accumulating function, such as Data.List.foldl', is a great choice whenver you’re reducing a list to a value, not building an eagerly-evaluated list of output. Functions such as sum, product or any can’t return any useful intermediate value. They inherently have to finish the computation first, then return the final result.

Davislor
  • 14,674
  • 2
  • 34
  • 49