6

I need to pack data like this:

let data = [1; 2; 2; 3; 2; 2; 2; 4]
let packed = [(1, 1); (2, 2); (3, 1); (2, 3); (4, 1)]

Where each item say how much times it exist before the next. However, it must work with non-adjacent duplications.

I can work this with classical imperative code, but wonder how do this functionally.

Also, Seq.countBy not work because it take in account all the values

Mark Seemann
  • 225,310
  • 48
  • 427
  • 736
mamcx
  • 15,916
  • 26
  • 101
  • 189

3 Answers3

8

If you already have an imperative version, you can follow a set of small steps to refector to a recursive implementation.

Recursion

While I don't know what your imperative version looks like, here's a recursive version:

let pack xs =
    let rec imp acc = function
    | [] -> acc
    | h::t ->
        match acc with
        | [] -> imp [(h, 1)] t
        | (i, count) :: ta ->
            if h = i
            then imp ((i, count + 1) :: ta) t
            else imp ((h, 1) :: (i, count) :: ta) t
    xs |> imp [] |> List.rev

This function has the type 'a list -> ('a * int) list when 'a : equality. It uses a private 'implementation function' called imp to do the work. This function is recursive, and threads an accumulator (called acc) throughout. This accumulator is the result list, having the type ('a * int) list.

If the accumulator list is empty, the head of the original list (h), as well as the count 1, is created as a tuple as the only element of the updated accumulator, and the imp function is recursively called with that updated accumulator.

If the accumulator already contains at least one element, the element is extracted via pattern matching, and the element in that tuple (i) is compared to h. If h = i, the accumulator is updated; otherwise, a new tuple is consed on acc. In both cases, though, imp is recursively called with the new accumulator.

You can call it with a list equivalent to your original tuple like this:

> pack [1; 2; 2; 3; 2; 2; 2; 4];;
val it : (int * int) list = [(1, 1); (2, 2); (3, 1); (2, 3); (4, 1)]

Fold

Once you have a recursive version, you often have the recipe for a version using a fold. In this case, since the above pack function has to reverse the accumulator in the end (using List.rev), a right fold is most appropriate. In F#, this is done with the built-in List.foldBack function:

let pack' xs =
    let imp x = function
        | (i, count) :: ta when i = x -> (i, count + 1) :: ta
        | ta -> (x, 1) :: ta
    List.foldBack imp xs []

In this case, the function passed to List.foldBack is a bit too complex to pass as an anonymous function, so I chose to define it as a private inner function. It's equivalent to the recursive imp function used by the above pack function, but you'll notive that it doesn't have to call itself recursively. Instead, it just has to return the new value for the accumulator.

The result is the same:

> pack' [1; 2; 2; 3; 2; 2; 2; 4];;
val it : (int * int) list = [(1, 1); (2, 2); (3, 1); (2, 3); (4, 1)]
Mark Seemann
  • 225,310
  • 48
  • 427
  • 736
  • 2
    Nice answer. A few things small changes can make it even clearer, in my opinion. You can replace `let imp x acc = match acc ...` with `let imp x = function ...` to avoid naming an identifier that you just immediately destructure. More importantly, you can switch the order of the pattern match cases and use a `when` clause to unify the `[]` and `x <> i` cases: `| (i, count) :: ta when i = x -> (i, count+1) :: ta | ta -> (x, 1) :: ta`. – kvb Dec 28 '15 at 15:07
  • @kvb Those are very nice simplifications! Thank you. Answer updated. – Mark Seemann Dec 28 '15 at 15:16
2

My solution assumes the data collection is a list. If having it as a tuple (as per your example) was intentional then for my solution to work the tuple has to be converted to a list (an example how to do it can be found here).

let groupFunc list = 
    let rec groupFuncRec acc lst init count =
        match lst with
        | [] -> List.rev acc
        | head::[] when head = init
            -> groupFuncRec ((init, count)::acc) [] 0 0
        | head::[] when head <> init
            -> groupFuncRec ((head, 1)::acc) [] 0 0
        | head::tail when head = init 
            -> groupFuncRec acc tail head (count+1)
        | head::tail when head <> init
            -> groupFuncRec ((init, count)::acc) tail head 1
    let t = List.tail list
    let h = List.head list
    groupFuncRec [] t h 1

When I run the function on your sample data I get back the expected result:

 list = [(1, 1); (2, 2); (3, 1); (4, 1)]
Community
  • 1
  • 1
PiotrWolkowski
  • 8,408
  • 6
  • 48
  • 68
2

You can get Seq.countBy to work by including some positional information in its argument. Of course, you need then to map back to your original data.

[1; 2; 2; 3; 2; 2; 2; 4]
|> Seq.scan (fun (s, i) x ->
    match s with
    | Some p when p = x -> Some x, i
    | _ -> Some x, i + 1 ) (None, 0)
|> Seq.countBy id
|> Seq.choose (function 
| (Some t, _), n -> Some(t, n)
| _ -> None )
|> Seq.toList
// val it : (int * int) list = [(1, 1); (2, 2); (3, 1); (2, 3); (4, 1)]
kaefer
  • 5,491
  • 1
  • 15
  • 20