10

Haskell newb here

I'm working on this problem in haskell:

(**) Eliminate consecutive duplicates of list elements.
If a list contains repeated elements they should be replaced with a single copy of the element. The order of the elements should not be changed.

Example:
* (compress '(a a a a b c c a a d e e e e))
(A B C A D E)

The solution (which I had to look up) uses foldr:

compress' :: (Eq a) => [a] -> [a]
compress' xs = foldr (\x acc -> if x == (head acc) then acc else x:acc) [last xs] xs

This foldr, according to the solution, takes two parameters, x and acc. It would seem like all foldr's take these parameters; is there any exception to this? Like a foldr that takes 3 or more? If not, is this convention redundant and can the formula be written with less code?

Mark Karavan
  • 2,654
  • 1
  • 18
  • 38
  • 3
    You can conceive of many instances where foldr takes more than 2 arguments, for example, `foldr (\f g x -> f (g x)) (\x -> x)`. – user2407038 Jan 12 '15 at 16:53
  • I'm not sure I would use `foldr` to implement your `compress` function. What should happen if the argument is the empty list? – jub0bs Jan 12 '15 at 17:02
  • The ``foldr`` implementation would be fine you would just need to have a pattern first which can deal with the case of an empty list ``compress' [] = []`` – Simon Gibbons Jan 12 '15 at 17:09
  • @SimonGibbons, another approach that works with any `Foldable` and that plays well with list fusion is to use a `Maybe` for the accumulator. – dfeuer Jun 01 '16 at 19:03

3 Answers3

19

foldr takes a function of 2 arguments, but this doesn't prevent it from taking a function of 3 arguments provided that function has the right type signature.

If we had a function

g :: x -> y -> z -> w

With

foldr :: (a -> b -> b) -> b -> [a] -> b

Where we want to pass g to foldr, then (a -> b -> b) ~ (x -> y -> z -> w) (where ~ is type equality). Since -> is right associative, this means we can write g's signature as

x -> y -> (z -> w)

and its meaning is the same. Now we've produced a function of two parameters that returns a function of one parameter. In order to unify this with the type a -> b -> b, we just need to line up the arguments:

a ->   |  x ->
b ->   |  y -> 
b      |  (z -> w)

This means that b ~ z -> w, so y ~ b ~ z -> w and a ~ x so g's type really has to be

g :: x -> (z -> w) -> (z -> w)

implying

foldr g :: (z -> w) -> [x] -> (z -> w)

This is certainly not impossible, although more unlikely. Our accumulator is a function instead, and to me this begs to be demonstrated with DiffLists:

type DiffList a = [a] -> [a]

append :: a -> DiffList a -> DiffList a
append x dl = \xs -> dl xs ++ [x]

reverse' :: [a] -> [a]
reverse' xs = foldr append (const []) xs $ []

Note that foldr append (const []) xs returns a function which we apply to [] to reverse a list. In this case we've given an alias to functions of the type [a] -> [a] called DiffList, but it's really no different than having written

append :: a -> ([a] -> [a]) -> [a] -> [a]

which is a function of 3 arguments.

bheklilr
  • 53,530
  • 6
  • 107
  • 163
5

As with all things in haskell have a look at the types of things to guide your way you can do this for any function in ghci.

Looking at this for foldr we see:

Prelude> :t foldr
foldr :: (a -> b -> b) -> b -> [a] -> b

This slightly abstract string can be written in english as:

foldr is a function that takes

1 ) a function with two parameters one of type a and one of type b and returns something of type b

2 ) A value of type b

3 ) A list of values of type a

And returns a value of type b

Where a and b are type variables (see here for a good tutorial on them) which can be filled in with any type you like.

Simon Gibbons
  • 6,969
  • 1
  • 21
  • 34
4

It turns out that you can solve your compress problem using a foldr with a three-argument function.

compress :: Eq a => [a] -> [a]
compress [] = []
compress (z:zs) = z : foldr f (const []) zs z
  where f x k w | x==w      = k x
                | otherwise = x : k x

Let's dissect that. First, we can improve readability by changing the last two lines to

  where f x k = \w -> if x==w then k x else x : k x

This makes it evident that a ternary function is nothing but a binary function returning a unary function. The advantage of looking at it in this way is that foldr is best understood when passed a binary function. Indeed, we are passing a binary function, which just happens to return a function.

Let's focus on types now:

f :: a -> (a -> [a]) -> (a -> [a])
f    x    k

So, x::a is the element of the list we are folding on. Function k is the result of the fold on the list tail. The result of f x k is something having the same type as k.

\w -> if ....    :: (a -> [a])

The overall idea behind this anonymous function is as follows. The parameter k plays the same role as acc in the OP code, except it is a function expecting the previous element w in the list before producing the accumulated compressed list.

Concretely, we use now k x when we used acc, passing on the current element to the next step, since by that time x will become the previous element w. At the top-level, we pass z to the function which is returned by foldr f (const []).

This compress variant is lazy, unlike the posted solution. In fact, the posted solution needs to scan the whole list before starting producing something: this is due to (\x acc -> ...) being strict in acc, and to the use of last xs. Instead, the above compress outputs list elements in a "streaming" fashion. Indeed, it works with infinite lists as well:

> take 10 $ compress [1..]
[1,2,3,4,5,6,7,8,9,10]

That being said, I think using a foldr here feels a bit weird: the code above is arguably less readable than the explicit recursion.

chi
  • 111,837
  • 3
  • 133
  • 218
  • Thanks, this is helping me understand this answer, http://stackoverflow.com/questions/37526740/why-is-the-f-version-of-this-program-6x-faster-than-the-haskell-one/37527523 . But by the way, changing the last two lines as you suggested to `where f x k = \w -> if x==w then k w else x : k w` doesn't seem to work. I get this: `*Main> compress [2,3,3] => [2,3,3]` – גלעד ברקן Jun 01 '16 at 16:56
  • Could you please help me understand how the return value of `f x k w` is also a function - if you say that the accumulator is a function, doesn't that mean that it stays a function throughout the fold? If, in your example, `k` is a function that takes one argument, then `k x` and `x : k x` already include the argument, and since as I understand, the accumulator each time becomes the return value of `f`, doesn't it then become `k x` (or `x : k x`), which is just a list, rather than a function? – גלעד ברקן Jun 01 '16 at 17:40
  • I wrote `k w` but I should have written `k x` -- edited. – chi Jun 01 '16 at 17:42
  • `f x k` is a function, `f x k w` is a list. The first function `f x k` indeed takes `w` as argument, and returns a list. `k` is a similar function. – chi Jun 01 '16 at 17:43
  • is `k x` a function? If the accumulator takes on that value, `k x`, how does the accumulator stay a function? – גלעד ברקן Jun 01 '16 at 17:44
  • 1
    No, `k x` is a list. `k` is a function. The accumulator stays a function because it changes from `k` (a function) to `\w -> if ...` (another function, which is also `f x k`) – chi Jun 01 '16 at 19:11
  • Btw, is it possible to adapt `foldM` to this 3-parameter technique? – AleXoundOS Jun 30 '19 at 12:39
  • 1
    @AleXoundOS I don't think so, since the result of `foldM` is in a monad, so returning `m (a->b)` won't let us pass the third argument. We could transform that into `a -> m b`, but in that case the monadic effect can not depend on `a`, so it's not completely general. – chi Jun 30 '19 at 13:34
  • I could only satisfy compiler with `f :: (c -> IO [b]) -> a -> IO (c -> IO [b])` for `foldM` and make no use of it. In simple terms, I need that 2nd `c` to be accessible as an argument, but it's inside my output context. Wondered if it's still possible to avoid tuples for a temporary accumulator, so to speak. @chi, thank you for the explanation. – AleXoundOS Jun 30 '19 at 15:17