2

I want to split a string into a list of strings of given length using fold.

split :: n -> String -> [String]
split 4 "abcdefghijklmnopqrst" -> ["abcd","efgh","ijkl","mnop","qrst"]

Can this be done using foldr? Can you provide an explanation so I could gain some intuition on how to use folds to solve problems like this?

Edit: I have added the recursive solution

split :: Int -> String -> [String]
split len xs
  | len >= length xs = [xs]
  | otherwise = take len xs : split len (drop len xs)
Will Ness
  • 70,110
  • 9
  • 98
  • 181
Aritra Dattagupta
  • 760
  • 1
  • 7
  • 22
  • 1
    What did you try, what is not working? – Willem Van Onsem Jun 25 '21 at 21:02
  • To add to Willems question, are you able to write `split` as a recursive function without `foldr`? In some cases it is pretty easy to go from the recursive definition to a definition that uses `foldr`. It will also probably be easier to explain it in terms of code that you know. – Noughtmare Jun 25 '21 at 21:23
  • I have just edited my question to add the recursive solution. How do I go from this to foldr? – Aritra Dattagupta Jun 25 '21 at 21:40
  • 1
    This kind of recursive function is probably not too useful for converting to a fold. You want natural recursion that takes list elements one by one. This normally involves pattern-matching on things like `[]` and `x:xs`. – n. m. could be an AI Jun 25 '21 at 22:32
  • I recommend having a look at https://wiki.haskell.org/Fold#List_folds_as_structural_transformations. Even if you doesn't get to the answer, at least it will help you work out something towards it. – pedrofurla Jun 25 '21 at 23:00
  • 2
    Although I agree with @n.1.8e9-where's-my-sharem., I do think this kind of problem is a good exercise. – pedrofurla Jun 25 '21 at 23:01
  • 2
    Frame challenge: *why* do you want to use foldr? No seasoned Haskell programmer would do that, even in the counterfactual world where they didn't just import a function that did this. – Daniel Wagner Jun 25 '21 at 23:56
  • 2
    @danielwagner I just wanted to do this using foldr as I am new to my haskell and wanted to know more how folding really works. There is no necessity to do this using folds as this is a function which I would need to solve a bigger problem which is the main goal of the exercise I am trying to solve. I was just curious what the solution would be using foldr. – Aritra Dattagupta Jun 26 '21 at 00:04
  • Just notice how simply this can be written using `unfoldr` from `Data.List`: `split len = unfoldr $ fmap (NE.splitAt len) . NE.nonEmpty`. `NE` is `Data.List.NonEmpty`. – peter pun Jun 29 '21 at 03:11

2 Answers2

2

To go from your recursive code to foldr you need to further inline and fuse the definitions of length, take and drop so that your code deconstructs the input list one element at a time, like foldr does:

split :: Int -> [a] -> [[a]]
split len xs
  | len >= length xs = [xs]
  | otherwise = take len xs : split len (drop len xs)
  where
  take :: Int -> [a] -> [a]
  take 0 xs = xs
  take 0 [] = []
  take n (x:xs) = x : take (n-1) xs
  -- and for `drop` and `len >= length` as well
  -- ......

So we need to add a counter for that.

split2 :: Int -> [a] -> [[a]]
split2 len xs = go xs 0
  where
  go [] _  = [[]]    -- [] : []
  go (x:xs) i 
    | i < len = [x] : go xs (i+1)
    | otherwise =     go xs 0

(do fix the off-by-one error(s) here, would you).

This of course doesn't produce the right results but it already follows the foldr pattern! It does the counting and the skipping along all at once, and correctly chooses the point to restart counting from 0 which corresponds to the recursive call with the dropped list -- already done for free, all by just going along the input list one element at a time -- like foldr does. We just need to save those elements we currently forget:

split3 :: Int -> [a] -> [[a]]
split3 len xs = go xs 0 []
  where
  go [] _ ys = [reverse ys]
  go (x:xs) i ys
    | i < len = go xs (i+1) (x:ys)
    | otherwise = reverse (x:ys) : go xs 0 []

This directly maps onto the foldr recursion pattern, where we replace go (x:xs) on the LHS with g x r and go xs on the RHS with r:

split4 :: Int -> [a] -> [[a]]
split4 len xs = foldr g z xs 0 []
  where
  z _ ys = [reverse ys]
  g x r i ys 
    | i < len = r (i+1) (x:ys)
    | otherwise = reverse (x:ys) : r 0 []

-- > split4 4 "123456"
-- ["12345","6"]

Again, do fix the off-by-one error, please.

This isn't fully lazy, though, i.e. (!! 0) . head . split4 3 $ "12"++undefined will diverge instead of returning the character '1' as it arguably should.

To achieve that, we must return the structure right away, even if with holes in it, and fill those holes later:

split5 :: Int -> [a] -> [[a]]
split5 len xs = foldr g z xs 1
  where
  z _ = [[]]
  g x r i
    | i < len = let (a:b) = r (i+1) in (x:a):b
    | otherwise = [x] : r 1

With this,

> split5 4 "123456"
["1234","56"]

> take 2 . head . split5 4 $ "12" ++ undefined
"12"

Still there's a corner case here where the return value isn't quite right. Do find it, and try to fix it, too.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
  • Really nice approach! To solve the problem of one trailing `[]` we can make `z` return `[]` when its argument is `1`. However, it still behaves differently from the OP's recursive solution (and from my anamorphic one) when `len` is `0`. Any ideas on how/if this can be made to work like my solution in this case, without treating it separately (and writing `repeat []`)? – peter pun Jun 29 '21 at 03:52
  • @peterpun where is this solution posted that you mention? in general, if some special handling is required for some special cases that the general code isn't handling well, we'll just out it through some pre/post/processing step, I imagine. – Will Ness Jun 29 '21 at 07:57
  • *put it through ---- of course this problem is a natural fit for unfoldr, as also mentioned in the comments. making it work through foldr is just an exercise. – Will Ness Jun 29 '21 at 08:42
  • Sorry, it's in the question's comments. I was just curious whether we can completely simulate that anamorphism with a catamorphism. Of course, the case where `len` is `0` is special because then the parts cannot at the same time have length at most `len` and produce the input when concatenated (when the input is non-empty). – peter pun Jun 29 '21 at 15:19
  • there was a question about how post/pre/processing step is unavoidable in some similar situation, I think, also involving foldr, IIRC, with an answer by chi. [here it is](https://stackoverflow.com/a/61496536/849891) (see if it's relevant :) ) – Will Ness Jun 29 '21 at 17:59
0

As others commented, using fold is probably not the best approach here. Nevertheless it is an interesting exercise.

From the signature of foldr we can figure out that the accumulator needs to be of type [String] and the combine operator of type Char -> [String] -> [String].

To implement combine, notice that we can append character by character to the accumulator. In the process we need to take care to start with a new string every time we exceed n characters:

split :: Int -> String -> [String]
split n = foldr combine []
    where
        combine c (x:xs) | length x < n = (c:x):xs
        combine c xs = [c]:xs
michid
  • 10,536
  • 3
  • 32
  • 59
  • 2
    this isn't lazy, won't work with infinite lists; also, `split 4 "123456" == ["12","3456"]` instead of `["1234","56"]` which is produced by the OP's recursive code. – Will Ness Jun 26 '21 at 13:07