Breaking a list into sublists of a specified size using foldr

Question

I'm taking a functional programming class and I'm having a hard time leaving the OOP mindset behind and finding answers to a lot of my questions.

I have to create a function that takes an ordered list and converts it into specified size sublists using a variation of fold.

This isn't right, but it's what I have:

splitList :: (Ord a) => Int -> [a] -> [[a]]
splitList size xs
    | [condition] = foldr (\item subList -> item:subList) [] xs
    | otherwise =

I've been searching and I found out that foldr is the variation that works better for what I want, and I think I've understood how fold works, I just don't know how I'll set up the guards so that when length sublist == size haskell resets the accumulator and goes on to the next list.

If I didn't explain myself correctly, here's the result I want:

> splitList 3 [1..10]
> [[1,2,3],[4,5,6],[7,8,9],[10]]

Thanks!

Does this answer your question? [Transform a list into a list of lists with fold in haskell](https://stackoverflow.com/questions/64688081/transform-a-list-into-a-list-of-lists-with-fold-in-haskell) — 4castle, Nov 09 '20 at 14:40

radrow · Answer 1 · 2020-11-09T09:02:33.163

While Fabián's and chi's answers are entirely correct, there is actually an option to solve this puzzle using foldr. Consider the following code:

splitList :: Int -> [a] -> [[a]]
splitList n =
  foldr (\el acc -> case acc of
            []                      -> [[el]]
            (h : t) | length h < n  -> (el : h) : t
            _                       -> [el] : acc 
            ) []

The strategy here is to build up a list by extending its head as long as its length is lesser than desired. This solution has, however, two drawbacks:

It does something slightly different than in your example;

splitList 3 [1..10] produces [[1],[2,3,4],[5,6,7],[8,9,10]]
It's complexity is O(n * length l), as we measure length of up to n–sized list on each of the element which yields linear number of linear operations.

Let's first take care of first issue. In order to start counting at the beginning we need to traverse the list left–to–right, while foldr does it right–to–left. There is a common trick called "continuation passing" which will allow us to reverse the direction of the walk:

splitList :: Int -> [a] -> [[a]]
splitList n l = map reverse . reverse $
  foldr (\el cont acc ->
            case acc of
              [] -> cont [[el]]
              (h : t) | length h < n -> cont ((el : h) : t)
              _ -> cont ([el] : acc)
           ) id l []

Here, instead of building the list in the accumulator we build up a function that will transform the list in the right direction. See this question for details. The side effect is reversing the list so we need to counter that by reverse application to the whole list and all of its elements. This goes linearly and tail-recursively tho.

Now let's work on the performance issue. The problem was that the length is linear on casual lists. There are two solutions for this:

Use another structure that caches length for a constant time access
Cache the value by ourselves

Because I guess it is a list exercise, let's go for the latter option:

splitList :: Int -> [a] -> [[a]]
splitList n l = map reverse . reverse . snd $
  foldr (\el cont (countAcc, listAcc) ->
            case listAcc of
              []                     -> cont (countAcc, [[el]])
              (h : t) | countAcc < n -> cont (countAcc + 1, (el : h) : t)
              (h : t)                -> cont (1, [el] : (h : t))
           ) id l (1, [])

Here we extend our computational state with a counter that at each points stores the current length of the list. This gives us a constant check on each element and results in linear time complexity in the end.

[here's](https://stackoverflow.com/a/64703257/849891) a `foldr` solution, that works for infinite lists as well. — Will Ness, Nov 09 '20 at 12:16
thank you. I'm pretty sure this was already asked and answered (including by me) several times on SO, but can't quite put my finger on it... — Will Ness, Nov 09 '20 at 12:46
Hey @WillNess, doesn't that mean this question is a duplicate? — Fabián Heredia Montiel, Nov 10 '20 at 06:52
@FabiánHerediaMontiel it looks very close to that, but since I've posted answers on both of them I'm refraining from voting on this issue. — Will Ness, Nov 10 '20 at 08:00

score 1 · Answer 2 · answered Nov 09 '20 at 04:20

A way to simplify this problem would be to split this into multiple functions. There are two things you need to do:

take n elements from the list, and
keep taking from the list as much as possible.

Lets try taking first:

taking :: Int -> [a] -> [a]
taking n [] = undefined
taking n (x:xs) = undefined

If there are no elemensts then we cannot take any more elements so we can only return an empty list, on the other hand if we do have an element then we can think of taking n (x:xs) as x : taking (n-1) xs, we would only need to check that n > 0.

taking n (x:xs)
  | n > 0 = x :taking (n-1) xs
  | otherwise = []

Now, we need to do that multiple times with the remainder so we should probably also return whatever remains from taking n elements from a list, in this case it would be whatever remains when n = 0 so we could try to adapt it to

  | otherwise = ([], x:xs)

and then you would need to modify the type signature to return ([a], [a]) and the other 2 definitions to ensure you do return whatever remained after taking n.

With this approach your splitList would look like:

splitList n [] = []
splitList n l = chunk : splitList n remainder
  where (chunk, remainder) = taking n l

Note however that folding would not be appropriate since it "flattens" whatever you are working on, for example given a [Int] you could fold to produce a sum which would be an Int. (foldr :: (a -> b -> b) -> b -> [a] -> b or "foldr function zero list produces an element of the function return type")

I sincerely thank you for your answer! It's actually mandatory that I use a variation of fold to solve this problem, it's exactly because of your last paragraph that I've been scratching my head around this question. I'm very new to Haskell and I'm having a hard time understanding the purpose of your functions through reading the code, could you tell me what each one does? — StrunKzZ, Nov 09 '20 at 04:35
Would you like me to elaborate or clarify something in particular? — Fabián Heredia Montiel, Nov 09 '20 at 04:38

chi · Answer 3 · 2020-11-09T08:48:37.027

You want:

splitList 3 [1..10]
> [[1,2,3],[4,5,6],[7,8,9],[10]]

Since the "remainder" [10] in on the tail, I recommend you use foldl instead. E.g.

splitList :: (Ord a) => Int -> [a] -> [[a]]
splitList size xs
    | size > 0 = foldl go [] xs
    | otherwise = error "need a positive size"
    where go acc x = ....

What should go do? Essentially, on your example, we must have:

splitList 3 [1..10] 
= go (splitList 3 [1..9]) 10
= go [[1,2,3],[4,5,6],[7,8,9]] 10
= [[1,2,3],[4,5,6],[7,8,9],[10]]

splitList 3 [1..9] 
= go (splitList 3 [1..8]) 9
= go [[1,2,3],[4,5,6],[7,8]] 9
= [[1,2,3],[4,5,6],[7,8,9]]

splitList 3 [1..8] 
= go (splitList 3 [1..7]) 8
= go [[1,2,3],[4,5,6],[7]] 8
= [[1,2,3],[4,5,6],[7,8]]

and

splitList 3 [1] 
= go [] 1
= [[1]]

Hence, go acc x should

check if acc is empty, if so, produce a singleton list [[x]].
otherwise, check the last list in acc:
- if its length is less than size, append x
- otherwise, append a new list [x] to acc

Try doing this by hand on your example to understand all the cases.

This will not be efficient, but it will work.

You don't really need the Ord a constraint.

score 1 · Answer 4 · answered Nov 09 '20 at 12:50

Checking the accumulator's first sublist's length would lead to information flow from the right and the first chunk ending up the shorter one, potentially, instead of the last. Such function won't work on infinite lists either (not to mention the foldl-based variants).

A standard way to arrange for the information flow from the left with foldr is using an additional argument. The general scheme is

subLists n xs  =  foldr g z xs n
  where
  g x r i  =  cons x i (r (i-1))
  ....

The i argument to cons will guide its decision as to where to add the current element into. The i-1 decrements the counter on the way forward from the left, instead of on the way back from the right. z must have the same type as r and as the foldr itself as a whole, so,

  z _  =  [[]]

This means there must be a post-processing step, and some edge cases must be handled as well,

subLists n xs  =  post . foldr g z xs $ n
  where
  z _  =  [[]]
  g x r i | i == 1  =  cons x i (r n)
  g x r i           =  cons x i (r (i-1))
  ....

cons must be lazy enough not to force the results of the recursive call prematurely.

I leave it as an exercise finishing this up.

For a simpler version with a pre-processing step instead, see this recent answer of mine.

score 1 · Answer 5 · edited Nov 13 '20 at 06:48

Just going to give another answer: this is quite similar to trying to write groupBy as a fold, and actually has a couple gotchas w.r.t. laziness that you have to bear in mind for an efficient and correct implementation. The following is the fastest version I found that maintains all the relevant laziness properties:

splitList :: Int -> [a] -> [[a]]
splitList m xs = snd (foldr f (const ([],[])) xs 1)
  where
    f x a i
      | i <= 1    = let (ys,zs) = a m in ([], (x : ys) : zs)
      | otherwise = let (ys,zs) = a (i-1) in  (x : ys  , zs)

The ys and the zs gotten from the recursive processing of the rest of list indicate the first and the rest of the groups into which the rest of the list will be broken up, by said recursive processing. So we either prepend the current element before that first subgroup if it is still shorter than needed, or we prepend before the first subgroup when it is just right and start a new, empty subgroup.

I've edited with some help for beginners; feel free to revert of course and pardon the intrusion :) (I also took the liberty to rename `n` to `i`, for better mental / visual separation from `m`.) — Will Ness, Nov 13 '20 at 06:53

Breaking a list into sublists of a specified size using foldr

5 Answers5