3

I have a seq of seqs in FSharp. I want to join a seq to the previous one if a predicate returns to true for it.

Sample:

let items = seq [seq[2;3;4];seq[1;5;6;7;1;9];seq[2;3;5;7]]

I want to join a seq to the previos one, if the seq starts by 1, so the result should be in this case:

seq [seq[2;3;4;1;5;6;7;1;9];seq[2;3;5;7]]

Is there any nice functional way to do it?

I am just started to translate my long computation processes from C# to F# and very impressed by the performance improvement I could achieve after even a very few hours of work and my beginner level knowledge of FSharp.

I have bought a book from Amazon entitled 'Beginning F#'. It is really great, but I mainly should work with seqs, lists, maps, collections now and this topic isn't explained as detailed as I need. Would anyone be so kind to advise me a good resource about ths topics?

Thx in advance!

Tom
  • 3,899
  • 22
  • 78
  • 137

4 Answers4

3
let joinBy f input =
  let i = ref 0
  input 
  |> Seq.groupBy (fun x ->
    if not (f x) then incr i
    !i)
  |> Seq.map (snd >> Seq.concat)

joinBy (Seq.head >> ((=) 1)) items
Daniel
  • 47,404
  • 11
  • 101
  • 179
  • It has a nice symmetry with my [answer](http://stackoverflow.com/questions/6736464/split-seq-in-f/6737659#6737659) to his previous question. – Daniel Jul 19 '11 at 14:26
  • +1 for nice and shorter solution :-). While I like the fact that it's shorter, it relies on the fact that `Seq.groupBy` is implmented in certain way. That's definitely not a problem (as long as there are tests for it). However, I try to avoid mutable variables in arguments to declarative functions like `Seq.groupBy` to keep the nice property that you can replace `Seq.groupBy` with e.g. parallel implementation and it'll still work. (that is, I'd say using impure functions as arguments for `Seq` combinators is "cheating", but cheating is often fine). – Tomas Petricek Jul 19 '11 at 14:38
  • @Tomas: Good point. It depends on `groupBy` being sequential, but that's a pretty safe assumption given that there's a separate module, `PSeq`, devoted to "non-sequential" execution, right? – Daniel Jul 19 '11 at 14:43
  • @Tom: In response to Tomas's point: this relies on the sequential behavior of `groupBy`. If this makes you feel queasy, it's trivial to roll your own. [Here](http://stackoverflow.com/questions/6726559/group-totals-in-f-easy-with-sequences-is-it-possible-with-lists/6728296#6728296)'s one way to do it (which, by the way, is very similar to how it's done in the F# [sources](https://github.com/fsharp/fsharp/blob/master/src/fsharp/FSharp.Core/seq.fs#L1248-1271)). – Daniel Jul 19 '11 at 14:54
  • @Daniel - yes, I think it is a safe assumption for `Seq` (I don't think someone may ever want to change it to non-sequential). I think it is mostly stylistic or philosophical thing. – Tomas Petricek Jul 19 '11 at 17:10
  • @Tomas: I suspect it is a safe assumption because the F# team will not want to break code that relies upon undefined behaviour but I don't like it. – J D Jul 20 '11 at 11:37
  • Doesn't this also rely upon the hash function used for the `int` keys by the `groupBy` function? That happens to be the identity function at the moment but changing to a different hash function will cause the subsequences in the output to be emitted in an effectively random order. – J D Jul 20 '11 at 11:48
  • @Jon: I don't see how it would reorder the subsequences. It seems it would affect the outer sequence, but that's an easy fix: sort the output of `groupBy` by key, prior to calling `Seq.map`. However, `int` hashes are very unlikely to change. What int could more uniquely identify an int than itself? – Daniel Jul 20 '11 at 13:58
  • @Jon: Are you saying iterative traversal of a sequence is "undefined behavior"? – Daniel Jul 20 '11 at 14:05
2

As with your last question, there is no library function that does exactly this. The most straightforward solution is to write this imperatively using IEnumerator. However, you can write a more generally useful function (that can then be used for other purposes too).

module Seq =
  /// Iterates over elements of the input sequence and groups adjacent elements.
  /// A new group is started when the specified predicate holds about the element
  /// of the sequence (and at the beginning of the iteration).
  /// For example: 
  ///    Seq.groupWhen isOdd [3;3;2;4;1;2] = seq [[3]; [3; 2; 4]; [1; 2]]
  let groupWhen f (input:seq<_>) = seq {
    use en = input.GetEnumerator()
    let running = ref true

    // Generate a group starting with the current element. Stops generating
    // when it founds element such that 'f en.Current' is 'true'
    let rec group() = 
      [ yield en.Current
        if en.MoveNext() then
          if not (f en.Current) then yield! group() 
        else running := false ]

    if en.MoveNext() then
      // While there are still elements, start a new group
      while running.Value do
        yield group() }

To solve the original problem, you can then check whether the first element of the sequence is a number other than 1. You'll get a sequence of groups where a group is sequence of sequences - then you can just concatenate the groups:

items 
  |> Seq.groupWhen (fun s -> Seq.head s <> 1)
  |> Seq.map Seq.concat

EDIT: I also posted the function as a snippet (with nice F# formatting) here: http://fssnip.net/6A

Tomas Petricek
  • 240,744
  • 19
  • 378
  • 553
  • this doesn't appear to work, it returns the same output as input (using the OP's example) – Stephen Swensen Jul 19 '11 at 14:17
  • @Stephen - Are you sure? I just copied the code from SO to Visual Studio and it worked fine... The result for OP's input is `[[2; 3; 4; 1; 5; 6; 7; 1; 9]; [2; 3; 5; 7]]` (when converted to lists). – Tomas Petricek Jul 19 '11 at 14:59
  • Ooops - apparently I was wrong and it does work correctly. I triple checked before I posted my comment so I don't know why I was getting bad results before (maybe I had some weird FSI cache or something since I was playing around with all the different solutions, testing again just now I first reset my session). – Stephen Swensen Jul 19 '11 at 15:06
  • @Stephen - No problem. Glad to hear it's actually working :-). – Tomas Petricek Jul 19 '11 at 17:07
  • Yeah, sorry about that, I can imagine that it may have stalled your up votes :( – Stephen Swensen Jul 19 '11 at 17:26
  • @Stephen - No worries! Daniel's solution deserves to be the first :-). – Tomas Petricek Jul 20 '11 at 12:39
2

As seen in the other solutions, this problem is almost the inverse of your last question. So for good measure, I give a modified version of my answer to that here:

let concatWithPreviousWhen f s = seq {
    let buffer = ResizeArray()

    let flush() = seq { 
        if buffer.Count > 0 then 
            yield Seq.readonly (buffer.ToArray())
            buffer.Clear() }

    for subseq in s do
        if f subseq |> not then yield! flush()
        buffer.AddRange(subseq)

    yield! flush() }

And you use it like so:

seq [seq[2;3;4];seq[1;5;6;7;1;9];seq[2;3;5;7]]
|> concatWithPreviousWhen (Seq.head>>(=)1)
Community
  • 1
  • 1
Stephen Swensen
  • 22,107
  • 9
  • 81
  • 136
1

Looks like a fold to me as shown below. Tried to be as functional as possible without ref values.

let joinBy f (s:'a seq seq) = 
    let (a:'a seq), (b:'a seq seq) = 
        s |> Seq.fold (fun (a,r) se -> 
                         if f se then (se |> Seq.append a,r) 
                         else (se, seq {yield! r; yield a} ) ) 
             (Seq.empty, Seq.empty)
    seq {yield! b; yield a} |> Seq.filter (Seq.isEmpty >> not)


seq [seq[2;3;4];seq[1;5;6;7;1;9];seq[2;3;5;7]]
|> joinBy (Seq.head >> ((=) 1))
|> printfn "%A"
Ankur
  • 33,367
  • 2
  • 46
  • 72