Remove consecutive duplicates from an infinite list via folding?

Question

Consider one of these implementations of a function to remove consecutive duplicates from a list:

uniq :: Eq a => [a] -> [a]
uniq [] = []
uniq (x:xs) = x:uniq (dropWhile (x ==) xs)

uniq :: Eq a => [a] -> [a]
uniq (x:xs@(y:_))
 | x == y    = x:tail (uniq xs)
 | otherwise = x:uniq xs
uniq l = l

They both work as expected on both finite and infinite lists. To be more specific, for infinite lists, I expect take n $ uniq l to terminate as long as there are no infinitely long sequences of duplicate values before there are n values to return.

Now consider this attempt at such a function, using foldr:

uniq :: (Foldable t, Eq a) => t a -> [a]
uniq = foldr uniqHelper []
 where uniqHelper x [] = [x]
       uniqHelper x acc@(y:_)
        | x == y    =   acc
        | otherwise = x:acc

This works correctly on finite lists, but never terminates for infinite ones, because uniqHelper always needs to evaluate its second argument. Is this something that can be fixed with a more clever uniqHelper, or is it inherently impossible to use folding for this task?

By "works as expected", do you mean `uniq (let ones = 1:ones in ones)` never terminates? Your first one only terminates once it finds a pair of distinct adjacent elements. — chepner, Dec 08 '18 at 17:45
@chepner Good catch. I expect `head $ uniq $ repeat 1` to return `1` indeed. The accepted answer handles that correctly, though. — Joseph Sible-Reinstate Monica, Dec 08 '18 at 17:55
`foldl` should be better but you may also do like `map head . group` with infinite lists. — Redu, Dec 08 '18 at 18:36
Unlike `foldl'`,`foldl` is lazy so i suppose there shouldn't be a problem in terminating. — Redu, Dec 08 '18 at 18:46
@Redu It will never terminate. `foldl` needs the last element in the list to terminate, and infinite lists don't have a last element. (`map head . group` produces the right results and terminates correctly, but it's just a less efficient version of my first function above with `dropWhile`.) — Joseph Sible-Reinstate Monica, Dec 08 '18 at 18:47
Yes you are right about `foldl`. So i should correct my my comment to read as "`foldr` should be better"; but then what exactly is an infinite string part beats me. Do you practically receive an infinite stream of characters? They should be arriving in chunks. I would normally implement this functionality with `foldl'` eventhough handsome `map head . group` is not that inefficient. — Redu, Dec 08 '18 at 19:03

Willem Van Onsem · Accepted Answer · 2018-12-08T17:43:05.150

You can transfer the removal of an element to the tail, so regardless of the value, we first "yield" x, and then we use a function (here tl) to evaluate the tail of the list:

uniq :: (Foldable t, Eq a) => t a -> [a]
uniq = foldr uniqHelper []
 where uniqHelper x acc = x : tl acc
           where tl acc@(x2:xs) | x == x2 = xs
                                | otherwise = acc
                 tl [] = []

We thus first yield x, and then worry about the next element later. If second parameter (the folded list of the tail of the list) contains the same element, then we "remove" it, otherwise, we retain the entire list.

The above is also able to yield the first element of a repeat, for example:

Prelude> head (uniq (repeat 1))
1
Prelude> take 5 $ uniq [1..]
[1,2,3,4,5]
Prelude> uniq [1,1,1,1,2,2,2,1,1,1,1,1,1,3,3,3,3,1,1]
[1,2,1,3,1]

Of course if there are an infinite amount of 0s followed by a 1, that 1 is never emitted:

Prelude> uniq (repeat 0 ++ [1])
[0

Will Ness · Answer 2 · 2018-12-11T07:13:31.283

Your first snippet produces the first x out of many but the third snippet produces the last x, that's the reason for the discrepancy.

To faithfully render the first snippet as a right fold, we fold into functions so we can pass a state argument along, occasionally updated to the new unique element of the list:

uniq [] = []
uniq (x:xs) = x : foldr g (const []) xs x
  where
  g y r x | y==x  =     r x
      | otherwise = y : r y

This actually skips the duplicates instead of re-consing and then ignoring them over and over as the other two answers are doing, which are actually equivalent to each other: the dropWhile could only ever skip over no more than one element, which will then be skipped by the following reducer calling its dropWhile (==x).

I always use r as the name of the reducer's second argument, as a mnemonic device for "recursive result". Here the presence of yet another argument after r in g's definition means that r is a function, which we can pass an updated value serving as a state.

This technique enables using foldr to encode stateful computations like take, drop, dropWhile, takeWhile, etc..

> head $ uniq $ repeat 1
1

> take 10 $ uniq [n | n <- [1..], n <- replicate n n]
[1,2,3,4,5,6,7,8,9,10]

> uniq [1..10]
[1,2,3,4,5,6,7,8,9,10]

score 1 · Answer 3 · answered Dec 09 '18 at 02:13

1

You can just combine your implementations:

uniq :: Eq a => [a] -> [a]
uniq = foldr uniqHelper []
  where uniqHelper x acc = x : dropWhile (==x) acc

to get desired behaviour on infinite lists:

Prelude> take 5 $ uniq [1..]
[1,2,3,4,5]

answered Dec 09 '18 at 02:13

Ed'ka

6,595
29
30

`uniq = foldr uniq' [] where uniq' x acc = x : dropWhile (== x) acc` – galva Feb 22 '21 at 09:28

Remove consecutive duplicates from an infinite list via folding?

3 Answers3