1

I am wondering if there is a way to count the distinct elements in a list and group the counts into tuples, for example

[4,4,4,2,2,2,2,1,1,3]

or

[4,2,4,4,2,1,2,2,1,3]

would yield

[(4,3),(2,4),(1,2),(3,1)]

while preserving the order of the original list.

This question mentions preserving the order in the comments, but never addresses the issue.

Here is my attempt thus far:

import Data.List (nub)

countOccurance :: (Eq a) => a -> [a] -> Int
countOccurance x = length . filter (==x)

naiveCounter :: (Eq a) => [a] -> [(a, Int)]
naiveCounter l = map (\x -> (x, countOccurance x l)) $ nub l

but this seems quite inefficient. Is there a way to construct this more efficiently (for instance, by traversing the list only one time)?

Thanks.

Community
  • 1
  • 1
o-90
  • 17,045
  • 10
  • 39
  • 63
  • From my answer to the question you linked: "If you want to order the result by first occurrence, you can use zip before the sort to add an index to each element, then, after grouping, sort again by that index and then remove the index" – sepp2k Mar 12 '17 at 03:05
  • Hmmm.... I should read more carefully. Thanks :) – o-90 Mar 12 '17 at 03:11

3 Answers3

5

You could use Data.Map.Ordered.

import Data.Map.Ordered (OMap)
import qualified Data.Map.Ordered as OMap

-- insert L L With
--        ^ ^
--        |  `----- insert combined elements on the left of the sequence
--        `-------- insert new      elements on the left of the sequence
insertLLWith :: Ord k => (v -> v -> v) -> (k, v) -> OMap k v -> OMap k v
insertLLWith f (k, v) m = case OMap.lookup k m of
    Nothing -> (k, v) OMap.|< m
    Just v' -> (k, f v v') OMap.|< m

Armed with insertLLWith (which should probably go into the library with a few variants -- it seems generally useful!), we can write a fairly straightforward fold:

multisetFromList :: Ord a => [a] -> OMap a Int
multisetFromList = foldr (\x -> insertLLWith (+) (x, 1)) OMap.empty

In ghci:

> multisetFromList [4,4,4,2,2,2,2,1,1,3]
fromList [(4,3),(2,4),(1,2),(3,1)]
> multisetFromList [2,1,2] -- works with ungrouped lists, too
fromList [(2,2),(1,1)]
Daniel Wagner
  • 145,880
  • 9
  • 220
  • 380
  • Thanks, this is a great solution, although I'd rather avoid maps for this particular problem. – o-90 Mar 12 '17 at 04:12
2

An alternative would be two right folds:

import Prelude hiding (lookup)
import Data.Map (empty, lookup, delete, insertWith)

count :: (Foldable t, Ord k, Num a) => t k -> [(k, a)]
count xs = foldr go (const []) xs $ foldr (\k -> insertWith (+) k 1) empty xs
  where
  go x f m = case lookup x m of
    Nothing -> f m
    Just i  -> (x, i): f (delete x m)

then,

\> count [4,2,4,4,2,1,2,2,1,3]
[(4,3),(2,4),(1,2),(3,1)]
behzad.nouri
  • 74,723
  • 18
  • 126
  • 124
1

As sepp2k commented, you could sort by index after grouping by element. I like to express this with generalized list comprehensions.

{-# LANGUAGE TransformListComp #-}
import GHC.Exts
countOccurrences :: (Ord a) => [a] -> [(a, Int)]
countOccurrences list =
    [ (the x, length x)
    | (i, x) <- zip [0..] list
    , then group by x using groupWith
    , then sortWith by minimum i
    ]

Other alternatives include updating the counter list as you go

countOccurrences :: (Eq a) => [a] -> [(a, Int)]
countOccurrences = foldl incrementCount [] where
    incrementCount [] x = [(x, 1)]
    incrementCount (count@(y, n):counts) x 
      | x == y = (y, n+1):counts
      | otherwise = count:incrementCount counts x 

or generating a list with all the partial counts, then filtering down to the final count

import Data.Function
import Data.List
countOccurrences :: (Eq a) => [a] -> [(a, Int)]
countOccurrences = nubBy ((==) `on` fst) . foldr addCount [] where
    addCount x counts = (x, maybe 1 succ $ lookup x counts) : counts

although neither is very efficient.

ephemient
  • 198,619
  • 38
  • 280
  • 391