7

I'm trying to convert basic functions into higher order functions (specifically map, filter, or foldr). I was wondering if there are any simple concepts to apply where I could see old functions I've written using guards and turn them into higher order.

I'm working on changing a function called filterFirst that removes the first element from the list (second argument) that does not satisfy a given predicate function (first argument).

filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst _ [] = []
filterFirst x (y:ys)
    | x y       = y : filterFirst x ys
    | otherwise = ys

For an example:

 greaterOne :: Num a=>Ord a=>a->Bool
 greaterOne x = x > 1

 filterFirst greaterOne [5,-6,-7,9,10]
 [5,-7,9,10]

Based on the basic recursion, I was wondering if there might be a way to translate this (and similar functions) to higher order map, filter, or foldr. I'm not very advanced and these functions are new to me.

duplode
  • 33,731
  • 7
  • 79
  • 150
  • 4
    Sounds like [`deleteBy`](https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-List.html#v:deleteBy): `deleteBy (>) 1 [5,-6,-7,9,10]` – melpomene Apr 29 '19 at 21:03
  • Is the question, "can I express `filterFirst` in terms of `map`, `filter`, and/or `foldr`?"? – moonGoose Apr 29 '19 at 21:20
  • Yes. And once it's changed, I want to use it to help me understand replacing older, more basic functions with higher order functions. If someone could show me, or explain an easier way to translate it, I would be appreciative. I'm just learning how to use them. –  Apr 29 '19 at 21:24
  • You can write `filterFirst` (and essentially anything else) with `foldr`, but it will be rather inelegant. Using `mapAccumL` might provide a better result, but IMO explicit recursion is the best here. – chi Apr 29 '19 at 21:37
  • 1
    @melpomene `deleteBy (>=)`, I think... this illustrates why passing non-symmetric functions to `deleteBy` is something I'd be cautious of. – user11228628 Apr 29 '19 at 22:13
  • 1
    @chi I think mapAccumL is unsuited because it is length-preserving (ie. you'd have to map it to something like `Maybe a` then take the `Just`s). Something like `concatMapAccum` from Data.Conduit.List could do it if it had a non-conduit analog (though that is more general, can send 1 element to any number of results). – moonGoose Apr 29 '19 at 22:46
  • @melpomene `deleteBy` would fit like a glove weren't it for its peculiar equivalence-expecting signature, which clashes with the general problem statement here. There actually have been calls to change `deleteBy` to `(a -> Bool) -> [a] -> [a]`... but then, negated predicate aside, you end up with `filterFirst`, exactly as the OP presented it here. – duplode May 03 '19 at 10:51

6 Answers6

4

There is a higher-order function that's appropriate here, but it's not in the base library. What's the trouble with foldr? If you just fold over the list, you'll end up rebuilding the whole thing, including the part after the deletion.

A more appropriate function for the job is para from the recursion-schemes package (I've renamed one of the type variables):

para :: Recursive t => (Base t (t, r) -> r) -> t -> r

In the case of lists, this specializes to

para :: (ListF a ([a], r) -> r) -> [a] -> r

where

data ListF a b = Nil | Cons a b
  deriving (Functor, ....)

This is pretty similar to foldr. The recursion-schemes equivalent of foldr is

cata :: Recursive t => (Base t r -> r) -> t -> r

Which specializes to

cata :: (ListF a r -> r) -> [a] -> r

Take a break here and figure out why the type of cata is basically equivalent to that of foldr.


The difference between cata and para is that para passes the folding function not only the result of folding over the tail of the list, but also the tail of the list itself. That gives us an easy and efficient way to produce the rest of the list after we've found the first non-matching element:

filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst f = para go
  where
    --go :: ListF a ([a], [a]) -> [a]
    go (Cons a (tl, r))
      | f a = a : r
      | otherwise = tl
    go Nil = []

para is a bit awkward for lists, since it's designed to fit into a more general context. But just as cata and foldr are basically equivalent, we could write a slightly less awkward function specifically for lists.

foldrWithTails
  :: (a -> [a] -> b -> b)
  -> b -> [a] -> b
foldrWithTails f n = go
  where
    go (a : as) = f a as (go as)
    go [] = n

Then

filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst f = foldrWithTails go []
  where
    go a tl r
      | f a = a : r
      | otherwise = tl
dfeuer
  • 48,079
  • 5
  • 63
  • 167
3

First, let's flip the argument order of your function. This will make a few steps easier, and we can flip it back when we're done. (I'll call the flipped version filterFirst'.)

filterFirst' :: [a] -> (a -> Bool) -> [a]
filterFirst' [] _ = []
filterFirst' (y:ys) x
    | x y       = y : filterFirst' ys x
    | otherwise = ys

Note that filterFirst' ys (const True) = ys for all ys. Let's substitute that in place:

filterFirst' :: [a] -> (a -> Bool) -> [a]
filterFirst' [] _ = []
filterFirst' (y:ys) x
    | x y       = y : filterFirst' ys x
    | otherwise = filterFirst' ys (const True)

Use if-else instead of a guard:

filterFirst' :: [a] -> (a -> Bool) -> [a]
filterFirst' [] _ = []
filterFirst' (y:ys) x = if x y then y : filterFirst' ys x else filterFirst' ys (const True)

Move the second argument to a lambda:

filterFirst' :: [a] -> (a -> Bool) -> [a]
filterFirst' [] = \_ -> []
filterFirst' (y:ys) = \x -> if x y then y : filterFirst' ys x else filterFirst' ys (const True)

And now this is something we can turn into a foldr. The pattern we were going for is that filterFirst' (y:ys) can be expressed in terms of filterFirst' ys, without using ys otherwise, and we're now there.

filterFirst' :: Foldable t => t a -> (a -> Bool) -> [a]
filterFirst' = foldr (\y f -> \x -> if x y then y : f x else f (const True)) (\_ -> [])

Now we just need to neaten it up a bit:

filterFirst' :: Foldable t => t a -> (a -> Bool) -> [a]
filterFirst' = foldr go (const [])
  where go y f x
          | x y       = y : f x
          | otherwise = f (const True)

And flip the arguments back:

filterFirst :: Foldable t => (a -> Bool) -> t a -> [a]
filterFirst = flip $ foldr go (const [])
  where go y f x
          | x y       = y : f x
          | otherwise = f (const True)

And we're done. filterFirst implemented in terms of foldr.


Addendum: Although filter isn't strong enough to build this, filterM is when used with the State monad:

{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.State

filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst x ys = evalState (filterM go ys) False
  where go y = do
          alreadyDropped <- get
          if alreadyDropped || x y then
            return True
          else do
            put True
            return False
2

If we really want, we can write filterFirst using foldr, since foldr is kind of "universal" -- it allows any list transformation we can perform using recursion. The main downside is that the resulting code is rather counter-intuitive. In my opinion, explicit recursion is far better in this case.

Anyway here's how it is done. This relies on what I consider to be an antipattern, namely "passing four arguments to foldr". I call this an antipattern since foldr is usually called with three arguments only, and the result is not a function taking a fourth argument.

filterFirst :: (a->Bool)->[a]->[a]
filterFirst p xs = foldr go (\_ -> []) xs True
   where
   go y ys True 
      | p y = y : ys True 
      | otherwise = ys False
   go y ys False = y : ys False

Clear? Not very much. The trick here is to exploit foldr to build a function Bool -> [a] which returns the original list if called with False, and the filtered-first list if called with True. If we craft that function using

foldr go baseCase xs

the result is then obviously

foldr go baseCase xs True

Now, the base case must handle the empty list, and in such case we must return a function returning the empty list, whatever the boolean argument is. Hence, we arrive at

foldr go (\_ -> []) xs True

Now, we need to define go. This takes as arguments:

  1. a list element y
  2. the result of the "recursion" ys (a function Bool->[a] for the rest of the list)

and must return a function Bool->[a] for the larger list. So let's also consider

  1. a boolean argument

and finally make go return a list. Well, if the boolean is False we must return the list unchanged, so

go y ys False = y : ys False

Note that ys False means "the tail unchanged", so we are really rebuilding the whole list unchanged.

If instead the boolean is true, we query the predicate as in p y. If that is false, we discard y, and return the list tail unchanged

   go y ys True 
      | p y = -- TODO
      | otherwise = ys False

If p y is true, we keep y and we return the list tail filtered.

   go y ys True 
      | p y = y : ys True
      | otherwise = ys False

As a final note, we cold have used a pair ([a], [a]) instead of a function Bool -> [a], but that approach does not generalize as well to more complex cases.

So, that's all. This technique is something nice to know, but I do not recommend it in real code which is meant to be understood by others.

chi
  • 111,837
  • 3
  • 133
  • 218
  • 2
    IMO, the main downside of a fold here is the loss of sharing. This is where paramorphisms come in. – dfeuer Apr 30 '19 at 16:45
  • (See my answer, and please feel free to edit it if you think you can help make it more accessible to Haskell beginners.) – dfeuer Apr 30 '19 at 17:31
  • 2
    An aside: I absolutely wouldn't consider higher-order use of `foldr` an anti-pattern. In some cases (not this one) it actually does produce clear code. In some cases (particularly when writing high-performance library code) it's necessary to take advantage of GHC's list fusion optimizations. And it's fairly often necessary when generalizing list-consuming functions to `Foldable`-consuming ones. – dfeuer Apr 30 '19 at 17:35
  • I find one aspect of this solution quite fun: In a foldr-with-four-arguments implementation, typically the fold operation either pre-composes functions (as in the foldl-from-foldr trick) or post-composes them (as in the dlist trick). Here it is necessary to do both, as we want to test the elements left-to-right without building a reversed list. That becomes plainer to the eye if we golf `go` into `go y ys = (bool id (y :) <*> ys) . (<= p y)` . – duplode Apr 30 '19 at 23:25
  • @dfeuer have you changed your [recent opinion](https://stackoverflow.com/questions/55522847/implementing-haskells-take-function-using-foldl/55584633#comment97756176_55523947) or have I misunderstood it? what specifically had you objected to, there? what's your opinion about the sharing, paramorphism-equivalent `filterFirst p xs = foldr go id xs xs where go x r ~(_:xs) | p x = x : r xs | otherwise = xs`? would it be good, fusion-wise? (meta clarification: I'm not being argumentative here; I want to understand your position). thanks. – Will Ness May 04 '19 at 17:40
  • (a variant: `... where go _ r (x:xs) | p x = ...`; might be clearer (?) ) – Will Ness May 04 '19 at 17:48
  • @WillNess, I don't see any contradiction. What are you getting at? And I see no advantage whatsoever to using `foldr` if you're going to walk the list by pattern matching in parallel. – dfeuer May 04 '19 at 21:31
2

Joseph and chi's answers already show how to derive a foldr implementation, so I'll try to aid intuition.

map is length-preserving, filterFirst is not, so trivially map must be unsuited for implementing filterFirst.

filter (and indeed map) are memoryless - the same predicate/function is applied to each element of the list, regardless of the result on other elements. In filterFirst, behaviour changes once we see the first non-satisfactory element and remove it, so filter (and map) are unsuited.

foldr is used to reduce a structure to a summary value. It's very general, and it might not be immediately obvious without experience what sorts of things this may cover. filterFirst is in fact such an operation, though. The intuition is something like, "can we build it in a single pass through the structure, building it up as we go(, with additional state stored as required)?". I fear Joseph's answer obfuscates a little, as foldr with 4 parameters, it may not be immediately obvious what's going on, so let's try it a little differently.

filterFirst p xs = snd $ foldr (\a (deleted,acc) -> if not deleted && not (p a) then (True,acc) else (deleted,a:acc) ) (False,[]) xs

Here's a first attempt. The "extra state" here is obviously the bool indicating whether or not we've deleted an element yet, and the list accumulates in the second element of the tuple. At the end we call snd to obtain just the list. This implementation has the problem, however, that we delete the rightmost element not satisfying the predicate, because foldr first combines the rightmost element with the neutral element, then the second-rightmost, and so on.

filterFirst p xs = snd $ foldl (\(deleted,acc) a -> if not deleted && not (p a) then (True,acc) else (deleted,a:acc) ) (False,[]) xs

Here, we try using foldl. This does delete the leftmost non-satisfactory element, but has the side-effect of reversing the list. We can stick a reverse at the front, and this would solve the problem, but is somewhat unsatisfactory due to the double-traversal.

Then, if you go back to foldr, having realized that (basically) if you want transform a list whilst preserving order that foldr is the correct variant, you play with it for a while and end up writing what Joseph suggested. I do however agree with chi that straightforward recursion is the best solution here.

moonGoose
  • 1,510
  • 6
  • 14
  • [map + zip = scanl](https://stackoverflow.com/a/11951590/849891) and we can do a lot with scanl. – Will Ness May 05 '19 at 07:08
  • @WillNess lol I'm not sure how I feel about that link, but I stand by the claim that `map` is *unsuited* to the task, if not *unusable*. – moonGoose May 05 '19 at 07:27
  • my lesson from it was, that in Haskell, `[] a` is not an unordered collection of `a`s, and `zip` is just `map2 (,)`. – Will Ness May 05 '19 at 07:54
2

Your function can also be expressed as an unfold, or, more specifically, as an apomorphism. Allow me to begin with a brief explanatory note, before the solution itself.


The apomorphism is the recursion scheme dual to the paramorphism (see dfeuer's answer for more about the latter). Apomorphisms are examples of unfolds, which generate a structure from a seed. For instance, Data.List offers unfoldr, a list unfold.

unfoldr :: (b -> Maybe (a, b)) -> b -> [a]

The function given to unfoldr takes a seed and either produces a list element and a new seed (if the maybe-value is a Just) or terminates the list generation (if it is Nothing). Unfolds are more generally expressed by the ana function from recursion-schemes ("ana" is short for "anamorphism").

ana :: Corecursive t => (a -> Base t a) -> a -> t

Specialised to lists, this becomes...

ana @[_] :: (b -> ListF a b) -> b -> [a]

... which is unfoldr in different clothing.

An apomorphism is an unfold in which the generation of the structure can be short-circuited at any point of the process, by producing, instead of a new seed, the rest of the structure in a fell swoop. In the case of lists, we have:

apo @[_] :: (b -> ListF a (Either [a] b)) -> b -> [a]

Either is used to trigger the short-circuit: with a Left result, the unfold short-circuits, while with a Right it proceeds normally.


The solution in terms of apo is fairly direct:

{-# LANGUAGE LambdaCase #-}

import Data.Functor.Foldable

filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst p = apo go
  where
  go = \case
    [] -> Nil
    a : as
      | p a -> Cons a (Right as) 
      | otherwise -> case as of
        [] -> Nil
        b : bs -> Cons b (Left bs) 

It is somewhat more awkward than dfeuer's para-based solution, because if we want to short-circuit without an empty list for a tail we are compelled to emit one extra element (the b in the short-circuiting case), and so we have to look one position ahead. This awkwardness would grow by orders of magnitude if, rather than filterFirst, we were to impĺement plain old filter with an unfold, as beautifully explained in List filter using an anamorphism.

duplode
  • 33,731
  • 7
  • 79
  • 150
  • Your presentation is much better than mine. I may have to borrow some of it if I have time to revise. – dfeuer May 06 '19 at 14:30
  • 1
    @dfeuer I, in turn, have borrowed your `foldrWithTails` [for an answer elsewhere](https://stackoverflow.com/a/56032831/2751851). – duplode May 08 '19 at 02:46
2

This answer is inspired by a comment from luqui on a now-deleted question.

filterFirst can be implemented in a fairly direct way in terms of span:

filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst p = (\(yeas, rest) -> yeas ++ drop 1 rest) . span p

span :: (a -> Bool) -> [a] -> ([a], [a]) splits the list in two at the first element for which the condition doesn't hold. After span, we drop the first element of the second part of the list (with drop 1 rather than tail so that we don't have to add a special case for []), and reassemble the list with (++).

As an aside, there is a near-pointfree spelling of this implementation which I find too pretty not to mention:

filterFirst :: (a -> Bool) -> [a] -> [a]
filterFirst p = uncurry (++) . second (drop 1) . span p

While span is a higher order function, it would be perfectly understandable if you found this implementation disappointing in the context of your question. After all, span is not much more fundamental than filterFirst itself. Shouldn't we try going a little deeper, to see if we can capture the spirit of this solution while expressing it as a fold, or as some other recursion scheme?

I believe functions like filterFirst can be fine demonstrations of hylomorphisms. A hylomorphism is an unfold (see my other answer for more on that) that generates an intermediate data structure followed by a fold which turns this data structure into something else. Though it might look like that would require two passes to get a result (one through the input structure, and another through the intermediate one), if the hylomorphism implemented properly (as done in the hylo function of recursion-schemes) it can be done in a single pass, with the fold consuming pieces of the intermediate structure as they are generated by the unfold (so that we don't have to actually build it all only to tear it down).

Before we start, here is the boilerplate needed to run what follows:

{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TemplateHaskell #-}

import Data.Functor.Foldable
import Data.Functor.Foldable.TH

The strategy here is picking an intermediate data structure for the hylomorphism that expresses the essence of what we want to achieve. In this case, we will use this cute thing:

data BrokenList a = Broken [a] | Unbroken a (BrokenList a)
    -- I won't actually use those instances here,
    -- but they are nice to have if you want to play with the type.
    deriving (Eq, Show, Functor, Foldable, Traversable)
makeBaseFunctor ''BrokenList 

BrokenList is very much like a list (Broken and Unbroken mirror [] and (:), while the makeBaseFunctor incantation generates a BrokenListF base functor analogous to ListF, with BrokenF and UnbrokenF constructors), except that it has another list attached at its end (the Broken constructor). It expresses, in a quite literal way, the idea of a list being divided in two parts.

With BrokenList at hand, we can write the hylomorphism. coalgSpan is the operation used for the unfold, and algWeld, the one used for the fold.

filterFirst p = hylo algWeld coalgSpan
    where
    coalgSpan = \case
        [] -> BrokenF []
        x : xs
            | p x -> UnbrokenF x xs
            | otherwise -> BrokenF xs
    algWeld = \case
        UnbrokenF x yeas -> x : yeas
        BrokenF rest -> rest

coalgSpan breaks the list upon hitting a x element such that p x doesn't hold. Not adding that element to the second part of the list (BrokenF xs rather than BrokenF (x : xs)) takes care of the filtering. As for algWeld, it is used to concatenate the two parts (it is very much like what we would use to implement (++) using cata).

(For a similar example of BrokenList in action, see the breakOn implementation in Note 5 of this older answer of mine. It suggests what it would take to implement span using this strategy.)

There are at least two good things about this hylo-based implementation. Firstly, it has good performance (casual testing suggests that, if compiled with optimisations, it is at least as good as, and possibly slightly faster than, the most efficient implementations in other answers here). Secondly, it reflects very closely your original, explicitly recursive implementation of filterFirst (or, at any rate, more closely than the fold-only and unfold-only implementations).

duplode
  • 33,731
  • 7
  • 79
  • 150
  • 1
    `BrokenList` is ["`People`"](https://wiki.haskell.org/Prime_numbers_miscellaneous#Implicit_Heap) too. :) – Will Ness May 03 '19 at 12:31
  • 1
    @WillNess Thanks, that is very interesting! Though I had tried to find prior art on `BrokenList`, this particular search string would never have occurred to me :) – duplode May 03 '19 at 16:37
  • [here's where it started](https://mail.haskell.org/pipermail/haskell-cafe/2007-July/029391.html). – Will Ness May 03 '19 at 18:47
  • I think it would pay to motivate the use of `hylo` by showing the unfused production and consumption of a `BrokenList`. It would also be nice to explain how using `BrokenList` is better (Easier to understand? Easier to modify somehow?) than a direct approach using `para` or raw pattern matching. – dfeuer May 06 '19 at 02:36
  • 1
    @dfeuer I'll look into doing that. As for what is better, the `para` solution is also fine, and explicit recursion isn't actually hard to follow in this specific case. Still, in spite of the relative verbosity I find the `hylo` solution very transparent. Maybe it's just an aesthetic feeling, but I find it very satisfying to see a datatype capturing an essential aspect of an algorithm like that. ([Designer monoid `foldMap` solutions](https://stackoverflow.com/a/55816153/2751851) make me happy in a similar way.) – duplode May 06 '19 at 03:20