One way to go about this is to update a piece of state as you traverse the list, similar to what you'd do in an imperative language. This requires working with the State
monad, which might take some studying and playing around to get it if it's your first time, but trust me, this is well worth learning. Let's start with imports:
import Control.Monad.State
import Data.Set (Set)
import qualified Data.Set as Set
The state we're going to keep is the Set
of elements seen up to that point in the list. So first, let's write a pair of simple State
actions to manage the set of seen elements:
-- Add an element to the context Set
remember :: Ord a => a -> State (Set a) ()
remember a = modify (Set.insert a)
-- Test if the context set contains an element
haveSeen :: Ord a => a -> State (Set a) Bool
haveSeen a = do seen <- get
return (a `Set.member` seen)
Now we're going to combine these two into an action that checks for duplication:
isDuplicate :: Ord a => a -> State (Set a) Bool
isDuplicate a = do seen <- haveSeen a
remember a
return seen
You've mentioned the takeWhile
function. We're going to build our solution along similar lines. This is takeWhile
's definition:
-- different name to avoid collision
takeWhile' :: (a -> Bool) -> [a] -> [a]
takeWhile' _ [] = []
takeWhile' p (a:as)
| p a = a : takeWhile p as
| otherwise = []
We can modify this function to work with a predicate that has the Bool
wrapped inside a monad:
takeWhileM :: Monad m => (a -> m Bool) -> [a] -> m [a]
takeWhileM _ [] = return []
takeWhileM p (a:as) =
do test <- p a
if test
then do rest <- takeWhileM p as
return (a:rest)
else return []
But the key difference here is that because the test in takeWhileM
is monadic, we can use our stateful isDuplicate
from above. So each time we test an element of the list with isDuplicate
, we will also record that element in the Set
that's being threaded through the computation. So now we can write takeUntilDuplicate
like this:
takeUntilDuplicate :: Ord a => [a] -> [a]
takeUntilDuplicate as = evalState (takeUntilDuplicate' as) Set.empty
where takeUntilDuplicate' :: Ord a => [a] -> State (Set a) [a]
takeUntilDuplicate' = takeWhileM (fmap not . isDuplicate)
Example use (with an infinite list argument):
>>> takeUntilDuplicate (cycle [1..5])
[1,2,3,4,5]
And what's neat is that several of these pieces of code could be reused for similar problems.